| | |
Last updated on October 1, 2023. This conference program is tentative and subject to change.
Technical Program for Monday October 2, 2023
| |
| MoAT1 Regular session, 140A |
Add to My Program |
| Semantic Scene Understanding |
|
| |
| Chair: Weiland, James | University of Michigan |
| Co-Chair: Simonin, Olivier | INSA De Lyon |
| |
| 08:30-08:36, Paper MoAT1.1 | Add to My Program |
| Gaussian Radar Transformer for Semantic Segmentation in Noisy Radar Data |
|
| Zeller, Matthias | CARIAD SE |
| Behley, Jens | University of Bonn |
| Heidingsfeld, Michael | CARIAD SE |
| Stachniss, Cyrill | University of Bonn |
Keywords: Semantic Scene Understanding, Deep Learning Methods
Abstract: Scene understanding is crucial for autonomous robots in dynamic environments for predicting future states, avoiding collisions, and planning paths. Camera and LiDAR perception have made tremendous progress in recent years but face limitations under adverse weather conditions. To leverage the full potential of multi-modal sensor suites, radar sensors are essential for safety-critical tasks and are already installed in most new vehicles today. In this paper, we address the problem of semantic segmentation of moving objects in radar point clouds to enhance the perception of the environment with another sensor modality. Instead of aggregating multiple scans to densify the point clouds, we propose a novel approach based on the self-attention mechanism to accurately perform sparse, single-scan segmentation. Our approach, called Gaussian Radar Transformer, includes the newly introduced Gaussian transformer layer, which replaces the softmax normalization with a Gaussian function to decouple the contribution of individual points. To tackle the transformer's challenge of capturing long-range dependencies, we propose attentive up- and downsampling modules to enlarge the receptive field and capture strong spatial relations. We compare our approach to other state-of-the-art methods on the RadarScenes dataset and show superior segmentation quality in diverse environments, even without exploiting temporal information.
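The abstract's core idea, replacing softmax attention weights with an unnormalized Gaussian so that each point's contribution is decoupled, can be sketched as follows. This is a minimal NumPy illustration, not the paper's layer: the function name, the distance-based formulation, and the `sigma` parameter are assumptions.

```python
import numpy as np

def gaussian_attention(q, k, v, sigma=1.0):
    """Attention weights from a Gaussian of pairwise query-key distances
    instead of a softmax over dot products. Without the softmax, each
    weight depends only on its own query-key pair, so suppressing one
    noisy point does not redistribute mass to the others."""
    # Pairwise squared Euclidean distances between queries and keys
    d2 = ((q[:, None, :] - k[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))  # no normalization across keys
    return w @ v
```

The absence of cross-key normalization is what the abstract calls decoupling: the weight assigned to one key is unaffected by how many other keys are present.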
|
| |
| 08:36-08:42, Paper MoAT1.2 | Add to My Program |
| Mask-Based Panoptic LiDAR Segmentation for Autonomous Driving |
|
| Marcuzzi, Rodrigo | University of Bonn |
| Nunes, Lucas | University of Bonn |
| Wiesmann, Louis | University of Bonn |
| Behley, Jens | University of Bonn |
| Stachniss, Cyrill | University of Bonn |
Keywords: Semantic Scene Understanding, Deep Learning Methods
Abstract: Autonomous vehicles need to understand their surroundings geometrically and semantically to plan and act appropriately in the real world. Panoptic segmentation of LiDAR scans provides a description of the surroundings by unifying semantic and instance segmentation. It is usually solved in a bottom-up manner consisting of two steps: predicting the semantic class for each 3D point and using this information to filter out "stuff" points, then clustering the "thing" points to obtain instance segmentation. The clustering is a post-processing step that often needs hyperparameter tuning, which usually does not adapt to instances of different sizes or different datasets. To this end, we propose MaskPLS, an approach to perform panoptic segmentation of LiDAR scans in an end-to-end manner by predicting a set of non-overlapping binary masks and semantic classes, fully avoiding the clustering step. As a result, each mask represents a single instance belonging to a "thing" class or a complete "stuff" class. Experiments on SemanticKITTI show that the end-to-end learnable mask generation leads to superior performance compared to state-of-the-art heuristic approaches.
|
| |
| 08:42-08:48, Paper MoAT1.3 | Add to My Program |
| SCENE: Reasoning about Traffic Scenes Using Heterogeneous Graph Neural Networks |
|
| Schmidt, Julian | Mercedes-Benz AG, Ulm University |
| Monninger, Thomas | Mercedes-Benz AG, University of Stuttgart |
| Rupprecht, Jan | Mercedes-Benz AG |
| Raba, David | Mercedes-Benz AG |
| Jordan, Julian | Mercedes-Benz AG |
| Frank, Daniel | University of Stuttgart |
| Staab, Steffen | University of Stuttgart |
| Dietmayer, Klaus | University of Ulm |
Keywords: Semantic Scene Understanding, AI-Based Methods, Behavior-Based Systems
Abstract: Understanding traffic scenes requires considering heterogeneous information about dynamic agents and the static infrastructure. In this work, we propose SCENE, a methodology to encode diverse traffic scenes in heterogeneous graphs and to reason about these graphs using a heterogeneous Graph Neural Network encoder and task-specific decoders. The heterogeneous graphs, whose structures are defined by an ontology, consist of different nodes with type-specific node features and different relations with type-specific edge features. In order to exploit all the information given by these graphs, we propose to use cascaded layers of graph convolution. The result is an encoding of the scene. Task-specific decoders can be applied to predict desired attributes of the scene. Extensive evaluation on two diverse binary node classification tasks shows the main strength of this methodology: despite being generic, it even manages to outperform task-specific baselines. The further application of our methodology to the task of node classification in various knowledge graphs shows its transferability to other domains.
|
| |
| 08:48-08:54, Paper MoAT1.4 | Add to My Program |
| Prototypical Contrastive Transfer Learning for Multimodal Language Understanding |
|
| Otsuki, Seitaro | Keio University |
| Ishikawa, Shintaro | Keio University |
| Sugiura, Komei | Keio University |
Keywords: Transfer Learning, Semantic Scene Understanding, Multi-Modal Perception for HRI
Abstract: Although domestic service robots are expected to assist individuals who require support, they cannot currently interact smoothly with people through natural language. For example, given the instruction "Bring me a bottle from the kitchen," it is difficult for such robots to specify the bottle in an indoor environment. Most conventional models have been trained on real-world datasets that are labor-intensive to collect, and they have not fully leveraged simulation data through a transfer learning framework. In this study, we propose a novel transfer learning approach for multimodal language understanding called Prototypical Contrastive Transfer Learning (PCTL), which uses a new contrastive loss called Dual ProtoNCE. We introduce PCTL to the task of identifying target objects in domestic environments according to free-form natural language instructions. To validate PCTL, we built new real-world and simulation datasets. Our experiment demonstrated that PCTL outperformed existing methods. Specifically, PCTL achieved an accuracy of 78.1%, whereas simple fine-tuning achieved an accuracy of 73.4%.
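The prototypical contrastive idea behind PCTL can be illustrated with a generic prototypical InfoNCE loss. The paper's Dual ProtoNCE is a specific variant not reproduced here; the `proto_nce` helper, the temperature value, and the use of class prototypes as the only negatives are assumptions.

```python
import numpy as np

def proto_nce(z, prototypes, labels, tau=0.1):
    """Generic prototypical InfoNCE: pull each embedding toward its
    class prototype and push it away from the other prototypes."""
    # L2-normalize embeddings and prototypes so the dot product is
    # a cosine similarity
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = z @ p.T / tau                               # similarity / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy against each sample's own class prototype
    return -log_prob[np.arange(len(z)), labels].mean()
```

Embeddings aligned with their class prototypes yield a low loss; misaligned embeddings yield a high one, which is the pull the transfer-learning stage exploits.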
|
| |
| 08:54-09:00, Paper MoAT1.5 | Add to My Program |
| Re-Thinking Classification Confidence with Model Quality Quantification |
|
| Pan, Yancheng | Peking University |
| Zhao, Huijing | Peking University |
Keywords: Semantic Scene Understanding, Autonomous Agents
Abstract: Deep neural networks used for real-world classification tasks require high reliability and robustness. However, the softmax output of the last layer of a network is often over-confident. We propose a novel confidence estimation method that considers model quality for deep classification models. Two metrics, MQ-Repres and MQ-Discri, are developed accordingly to evaluate model quality, and they also provide a new confidence estimate, called MQ-Conf, for online inference. We demonstrate the capability of the proposed method on 3D semantic segmentation tasks using three different deep networks. Through confusion analysis and feature visualization, we show the rationality and reliability of the model quality quantification method.
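The over-confidence problem the abstract targets is easy to demonstrate: even moderately separated logits produce a near-certain softmax probability. The numbers below are toy values, unrelated to the paper's experiments.

```python
import numpy as np

def softmax(logits):
    """Standard softmax with max-subtraction for numerical stability."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

# A logit gap of a few units already yields a near-certain top class,
# regardless of whether the model is actually reliable on this input.
p = softmax(np.array([5.0, 1.0, 0.5]))
```

This is why a separate confidence estimate (here, MQ-Conf) that accounts for model quality can be more trustworthy than the raw softmax value.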
|
| |
| 09:00-09:06, Paper MoAT1.6 | Add to My Program |
| Self-Supervised Drivable Area Segmentation Using LiDAR's Depth Information for Autonomous Driving |
|
| Ma, Fulong | The Hong Kong University of Science and Technology |
| Liu, Yang | The Hong Kong University of Science and Technology |
| Wang, Sheng | Hong Kong University of Science and Technology |
| Jin, Wu | UESTC |
| Qi, Weiqing | HKUST |
| Liu, Ming | Hong Kong University of Science and Technology |
Keywords: Semantic Scene Understanding, Perception for Grasping and Manipulation, Mapping
Abstract: Drivable area segmentation is an essential component of the visual perception system for autonomous driving vehicles. Recent efforts in deep neural networks have significantly improved semantic segmentation performance for autonomous driving. However, most DNN-based methods need a large amount of data to train the models, and collecting large-scale datasets with manually labeled ground truth is costly, tedious, time-consuming, and requires the availability of experts, making DNN-based methods often difficult to implement in real-world applications. Hence, in this paper, we introduce a novel module named automatic data labeler (ADL), which leverages a deterministic LiDAR-based method for ground plane segmentation and road boundary detection to create large datasets suitable for training DNNs. Furthermore, since the data generated by our ADL module is not as accurate as the manually annotated data, we introduce uncertainty estimation to compensate for the gap between the human labeler and our ADL. Finally, we train the semantic segmentation neural networks using our automatically generated labels on the KITTI dataset and KITTI-CARLA dataset. The experimental results demonstrate that our proposed ADL method not only achieves impressive performance compared to manual labeling but also exhibits more robust and accurate results than both traditional methods and state-of-the-art self-supervised methods.
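A deterministic LiDAR ground segmentation of the kind the ADL module builds on is commonly implemented as a plane fit. A seeded RANSAC sketch is shown below; the abstract does not specify the exact algorithm, so this is a generic stand-in.

```python
import numpy as np

def ransac_ground_plane(points, n_iter=200, thresh=0.2, seed=0):
    """RANSAC plane fit over an (N, 3) point cloud. Returns a boolean
    inlier (ground) mask. Seeded so the result is repeatable, in the
    spirit of a deterministic labeling pipeline."""
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        tri = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(tri[1] - tri[0], tri[2] - tri[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:            # degenerate (collinear) sample
            continue
        n /= norm
        d = -n @ tri[0]
        mask = np.abs(points @ n + d) < thresh  # point-to-plane distance
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask
```

The inlier mask would then seed the automatic labels, with the paper's uncertainty estimation compensating for residual labeling error.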
|
| |
| 09:06-09:12, Paper MoAT1.7 | Add to My Program |
| Vehicle Motion Forecasting Using Prior Information and Semantic-Assisted Occupancy Grid Maps |
|
| Asghar, Rabbia | INRIA / Univ. Grenoble Alpes |
| Diaz-Zapata, Manuel | Inria Grenoble |
| Rummelhard, Lukas | INRIA |
| Spalanzani, Anne | INRIA / Univ. Grenoble Alpes |
| Laugier, Christian | INRIA |
Keywords: Semantic Scene Understanding, Deep Learning Methods, Autonomous Vehicle Navigation
Abstract: Motion prediction is a challenging task for autonomous vehicles due to uncertainty in the sensor data, the non-deterministic nature of the future, and the complex behavior of agents. In this paper, we tackle this problem by representing the scene as dynamic occupancy grid maps (DOGMs), associating semantic labels with the occupied cells, and incorporating map information. We propose a novel framework that combines deep-learning-based spatio-temporal and probabilistic approaches to predict multimodal vehicle behaviors. Contrary to conventional OGM prediction methods, the evaluation of our work is conducted against ground truth annotations. We experiment and validate our results on the real-world nuScenes dataset and show that our model predicts both static and dynamic vehicles better than OGM predictions. Furthermore, we perform an ablation study and assess the role of the semantic labels and the map in the architecture.
|
| |
| 09:12-09:18, Paper MoAT1.8 | Add to My Program |
| Enhance Local Feature Consistency with Structure Similarity Loss for 3D Semantic Segmentation |
|
| Lin, Cheng-Wei | Department of Computer Science, National Yang Ming Chiao Tung University |
| Syu, Fang-Yu | Department of Computer Science, National Yang Ming Chiao Tung University |
| Pan, Yi-Ju | National Yang Ming Chiao Tung University |
| Chen, Kuan-Wen | National Yang Ming Chiao Tung University |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Deep Learning for Visual Perception
Abstract: Recently, many research studies have been carried out on using deep learning methods for 3D point cloud understanding. However, results on 3D point cloud semantic segmentation still fall short of those in 2D research. One important reason is that 3D data has higher dimensionality but lacks large datasets, which makes deep learning models difficult to optimize and prone to overfitting. To overcome this, an essential approach is to provide more priors to the learning of deep models. In this paper, we focus on semantic segmentation for point clouds in the real world. To provide priors to the model, we propose a novel loss function, called Linearity and Planarity, to enhance local feature consistency in regions with similar local structure. Experiments show that the proposed method improves baseline performance on both indoor and outdoor datasets, e.g., S3DIS and Semantic3D.
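The geometric quantities the proposed loss is named after are the standard eigenvalue-based linearity and planarity of a local point neighborhood, which can be computed as below. The loss itself (enforcing feature consistency across structurally similar regions) is not reproduced; normalizing by the largest eigenvalue is the common convention and an assumption here.

```python
import numpy as np

def linearity_planarity(neighborhood):
    """Eigenvalue-based linearity and planarity of an (N, 3) local
    point neighborhood: a line concentrates variance in one principal
    direction, a plane in two."""
    cov = np.cov(neighborhood.T)                   # 3x3 covariance
    lam = np.sort(np.linalg.eigvalsh(cov))[::-1]   # lam[0] >= lam[1] >= lam[2]
    linearity = (lam[0] - lam[1]) / (lam[0] + 1e-12)
    planarity = (lam[1] - lam[2]) / (lam[0] + 1e-12)
    return linearity, planarity
```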
|
| |
| 09:18-09:24, Paper MoAT1.9 | Add to My Program |
| Lightweight Semantic Segmentation Network for Semantic Scene Understanding on Low-Compute Devices |
|
| Son, Hojun | University of Michigan |
| Weiland, James | University of Michigan |
Keywords: Semantic Scene Understanding, Embedded Systems for Robotic and Automation, Deep Learning for Visual Perception
Abstract: Semantic scene understanding is beneficial for mobile robots. Semantic information obtained through onboard cameras can improve robots' navigation performance. However, obtaining semantic information on small mobile robots with constrained power and computation resources is challenging. We propose a new lightweight convolutional neural network comparable to previous semantic segmentation algorithms for mobile applications. Our network achieved 73.06% on the Cityscapes validation set and 71.8% on the Cityscapes test set. Our model runs at 116 FPS with 1024x2048 input, 172 FPS with 1024x1024, and 175 FPS with 720x960 on an NVIDIA GTX 1080. We analyze model size, defined as the sum of the number of floating-point operations and the number of parameters. A smaller model size enables tiny mobile robot systems, which must run multiple tasks simultaneously, to work efficiently. Our model has the smallest model size compared to the real-time semantic segmentation convolutional neural networks ranked on the Cityscapes real-time benchmark and other high-performing, lightweight convolutional neural networks. On the CamVid test set, our model achieved a mIoU of 73.29% with Cityscapes pre-training, outperforming the accuracy of other lightweight convolutional neural networks. For mobile applicability, we measured frames per second on different low-compute devices. Our model runs at 35 FPS on a Jetson Xavier AGX, 21 FPS on a Jetson Xavier NX, and 14 FPS on a ROS ASUS gaming phone. A 1024x2048 resolution is used for the Jetson devices, and a 512x512 size is used for the measurement on the phone. Our network did not use extra datasets such as ImageNet, Coarse Cityscapes, and Mapillary. Additionally, we did not use TensorRT to achieve fast inference speed. Compared to other real-time and lightweight CNNs, our model achieves significantly higher efficiency while balancing accuracy, inference speed, and model size.
|
| |
| 09:24-09:30, Paper MoAT1.10 | Add to My Program |
| LiDAR-SGMOS: Semantics-Guided Moving Object Segmentation with 3D LiDAR |
|
| Gu, Shuo | Nanjing University of Science and Technology |
| Yao, Suling | Nanjing University of Science and Technology |
| Yang, Jian | Nanjing University of Science & Technology |
| Xu, Chengzhong | University of Macau |
| Kong, Hui | University of Macau |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Deep Learning Methods
Abstract: Most existing moving object segmentation (MOS) methods regard MOS as an independent task. In this paper, we associate the MOS task with semantic segmentation and propose a semantics-guided network for moving object segmentation (LiDAR-SGMOS). We first transform the range image and semantic features of the past scan into the range view of the current scan based on the relative pose between scans. The residual image is obtained by calculating the normalized absolute difference between the current and transformed range images. Then, we apply a Meta-Kernel-based cross scan fusion (CSF) module to adaptively fuse the range image and semantic features of the current scan, the residual image, and the transformed features. Finally, the fused features with rich motion and semantic information are processed to obtain reliable MOS results. We also introduce a residual image augmentation method to further improve the MOS performance. Our method outperforms most LiDAR-MOS methods with only two sequential LiDAR scans as inputs on the SemanticKITTI MOS dataset.
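The residual image step described in the abstract can be sketched directly. This is a minimal version: normalizing by the current range and masking invalid (non-positive) returns are assumptions about details the abstract leaves open.

```python
import numpy as np

def residual_image(curr_range, transformed_range, eps=1e-6):
    """Normalized absolute difference between the current range image
    and the past range image transformed into the current view. Large
    values hint at motion between the two scans."""
    # Only compare pixels with valid returns in both images
    valid = (curr_range > 0) & (transformed_range > 0)
    res = np.zeros_like(curr_range)
    res[valid] = (np.abs(curr_range[valid] - transformed_range[valid])
                  / (curr_range[valid] + eps))
    return res
```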
|
| |
| 09:30-09:36, Paper MoAT1.11 | Add to My Program |
| Robust Fusion for Bayesian Semantic Mapping |
|
| Morilla-Cabello, David | Universidad De Zaragoza |
| Mur Labadia, Lorenzo | University of Zaragoza |
| Martinez-Cantin, Ruben | University of Zaragoza |
| Montijano, Eduardo | Universidad De Zaragoza |
Keywords: Semantic Scene Understanding, Mapping, Deep Learning for Visual Perception
Abstract: The integration of semantic information in a map allows robots to better understand their environment and make high-level decisions. In the last few years, neural networks have shown enormous progress in their perception capabilities. However, when fusing multiple observations from a neural network in a semantic map, its inherent overconfidence with unknown data gives too much weight to outliers and decreases robustness. To mitigate this issue, we propose a novel robust fusion method to combine multiple Bayesian semantic predictions. Our method uses the uncertainty estimation provided by a Bayesian neural network to calibrate the way in which the measurements are fused. This is done by regularizing the observations to mitigate the problem of overconfident outlier predictions and using the epistemic uncertainty to weigh their influence in the fusion, resulting in a different formulation of the probability distributions. We validate our robust fusion strategy by performing experiments on photo-realistic simulated environments and real scenes. In both cases, we use a network trained on different data to expose the model to varying data distributions. The results show that considering the model's uncertainty and regularizing the probability distributions of the observations results in better semantic segmentation performance and more robustness to outliers, compared with other methods.
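The fusion idea, down-weighting observations with high epistemic uncertainty when combining semantic predictions, can be sketched generically. The log-linear weighting below is an illustrative choice, not the paper's exact Bayesian formulation.

```python
import numpy as np

def fuse_semantic(observations, uncertainties, alpha=1.0):
    """Fuse class-probability vectors from several observations of the
    same map cell, down-weighting those with high epistemic
    uncertainty so overconfident outliers contribute less."""
    log_post = np.zeros_like(observations[0])
    for probs, u in zip(observations, uncertainties):
        w = np.exp(-alpha * u)               # low uncertainty -> weight near 1
        log_post += w * np.log(probs + 1e-12)
    post = np.exp(log_post - log_post.max())  # renormalize safely
    return post / post.sum()
```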
|
| |
| 09:36-09:42, Paper MoAT1.12 | Add to My Program |
| ConSOR: A Context-Aware Semantic Object Rearrangement Framework for Partially Arranged Scenes |
|
| Ramachandruni, Kartik | Georgia Institute of Technology |
| Zuo, Max | Georgia Institute of Technology |
| Chernova, Sonia | Georgia Institute of Technology |
Keywords: Semantic Scene Understanding, Deep Learning Methods
Abstract: Object rearrangement is the problem of enabling a robot to identify the correct object placement in a complex environment. Prior work on object rearrangement has explored a diverse set of techniques for following user instructions to achieve some desired goal state. Logical predicates, images of the goal scene, and natural language descriptions have all been used to instruct a robot in how to arrange objects. In this work, we argue that burdening the user with specifying goal scenes is not necessary in partially-arranged environments, such as common household settings. Instead, we show that contextual cues from partially arranged scenes (i.e., the placement of some number of pre-arranged objects in the environment) provide sufficient context to enable robots to perform object rearrangement without any explicit user goal specification. We introduce ConSOR, a Context-aware Semantic Object Rearrangement framework that utilizes contextual cues from a partially arranged initial state of the environment to complete the arrangement of new objects, without explicit goal specification from the user. We demonstrate that ConSOR strongly outperforms two baselines in generalizing to novel object arrangements and unseen object categories. The code and data are available at https://github.com/kartikvrama/consor.
|
| |
| 09:42-09:48, Paper MoAT1.13 | Add to My Program |
| IDA: Informed Domain Adaptive Semantic Segmentation |
|
| Chen, Zheng | Indiana University Bloomington |
| Ding, Zhengming | Tulane University |
| Gregory, Jason M. | US Army Research Laboratory |
| Liu, Lantao | Indiana University |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Object Detection, Segmentation and Categorization
Abstract: Mixup-based data augmentation has been validated to be a critical stage in the self-training framework for unsupervised domain adaptive semantic segmentation (UDA-SS), which aims to transfer knowledge from a well-annotated (source) domain to an unlabeled (target) domain. Existing self-training methods usually adopt the popular region-based mixup techniques with a random sampling strategy, which unfortunately ignores the dynamic evolution of different semantics across various domains as training proceeds. To improve the UDA-SS performance, we propose an Informed Domain Adaptation (IDA) model, a self-training framework that mixes the data based on class-level segmentation performance, which aims to emphasize small-region semantics during mixup. In our IDA model, the class-level performance is tracked by an expected confidence score (ECS). We then use a dynamic schedule to determine the mixing ratio for data in different domains. Extensive experimental results reveal that our proposed method is able to outperform the state-of-the-art UDA-SS method by a margin of 1.1 mIoU in the adaptation of GTA-V to Cityscapes and of 0.9 mIoU in the adaptation of SYNTHIA to Cityscapes.
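The ECS-driven sampling idea can be sketched as an inverse-confidence softmax over classes: classes the model currently segments poorly are sampled more often for mixup. The paper's dynamic schedule is more involved; the temperature and the exact inversion below are assumptions.

```python
import numpy as np

def class_sampling_probs(ecs, temperature=0.1):
    """Turn per-class expected confidence scores (ECS) into mixup
    sampling probabilities that favor poorly performing (low-ECS)
    classes, e.g. small-region semantics."""
    logits = -np.asarray(ecs, dtype=float) / temperature
    logits -= logits.max()                # numerical stability
    p = np.exp(logits)
    return p / p.sum()
```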
|
| |
| 09:48-09:54, Paper MoAT1.14 | Add to My Program |
| Self-Supervised Learning for Panoptic Segmentation of Multiple Fruit Flower Species |
|
| Siddique, Abubakar | Marquette University |
| Tabb, Amy | USDA-ARS-AFRS |
| Medeiros, Henry | University of Florida |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Incremental Learning
Abstract: Convolutional neural networks trained using manually generated labels are commonly used for semantic or instance segmentation. In precision agriculture, automated flower detection methods use supervised models and post-processing techniques that may not perform consistently as the appearance of the flowers and the data acquisition conditions vary. We propose a self-supervised learning strategy to enhance the sensitivity of segmentation models to different flower species using automatically generated pseudo-labels. We employ a data augmentation and refinement approach to improve the accuracy of the model predictions. The augmented semantic predictions are then converted to panoptic pseudo-labels to iteratively train a multi-task model. The self-supervised model predictions can be refined with existing post-processing approaches to further improve their accuracy. An evaluation on a multi-species fruit tree flower dataset demonstrates that our method outperforms state-of-the-art models without computationally expensive post-processing steps, providing a new baseline for flower detection applications.
|
| |
| MoAT2 Regular session, 140B |
Add to My Program |
| Wearable and Assistive Devices |
|
| |
| Chair: Audu, Musa. L. | Case Western Reserve University |
| Co-Chair: Kong, Kyoungchul | Korea Advanced Institute of Science and Technology |
| |
| 08:30-08:36, Paper MoAT2.1 | Add to My Program |
| Combined Admittance Control with Type II Singularity Evasion for Parallel Robots Using Dynamic Movement Primitives (I) |
|
| Escarabajal, Rafael J. | Universidad Politécnica De Valencia |
| Pulloquinga, José Luis | Universidad Politécnica De Valencia |
| Valera, Angel | Universidad Politécnica De Valencia |
| Mata, Vicente | Universidad Politécnica De Valencia |
| Valles, Marina | Universitat Politècnica De València |
| Castillo-García, Fernando J. | Universidad De Castilla-La Mancha |
Keywords: Rehabilitation Robotics, Parallel Robots, Compliance and Impedance Control, Dynamic Movement Primitives
Abstract: This paper addresses a new way of generating compliant trajectories for control using movement primitives to allow physical human-robot interaction where parallel robots (PRs) are involved. PRs are suitable for tasks requiring precision and performance because of their robust behavior. However, two fundamental issues must be resolved to ensure safe operation: i) the force exerted on the human must be controlled and limited, and ii) Type II singularities should be avoided to keep complete control of the robot. We offer a unified solution under the Dynamic Movement Primitives (DMP) framework to tackle both tasks simultaneously. DMPs are used to get an abstract representation for movement generation and are involved in broad areas such as imitation learning and movement recognition. For force control, we design an admittance controller intrinsically defined within the DMP structure, and subsequently, the Type II singularity evasion layer is added to the system. Both the admittance controller and the evader exploit the dynamic behavior of the DMP and its properties related to invariance and temporal coupling, and the whole system is deployed in a real PR meant for knee rehabilitation. The results show the capability of the system to perform safe rehabilitation exercises.
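The DMP transformation system that both the admittance controller and the singularity evader build on has a standard second-order form. A one-DoF rollout without a forcing term is sketched below; the gains are textbook values, not the paper's, and the admittance and evasion layers are omitted.

```python
def dmp_rollout(y0, g, T=1.0, dt=0.001, alpha=25.0, beta=6.25):
    """Integrate a one-DoF DMP transformation system without a forcing
    term: tau*dy/dt = z, tau*dz/dt = alpha*(beta*(g - y) - z).
    With beta = alpha/4 the system is critically damped and converges
    to the goal g, which is what makes DMPs a convenient substrate for
    layering admittance behavior on top."""
    tau = T
    y, z = float(y0), 0.0
    for _ in range(int(T / dt)):
        z += dt * alpha * (beta * (g - y) - z) / tau
        y += dt * z / tau
    return y
```

The temporal coupling the abstract mentions corresponds to scaling `tau`: slowing the clock slows the whole trajectory without changing its shape.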
|
| |
| 08:36-08:42, Paper MoAT2.2 | Add to My Program |
| A Handle Robot for Providing Bodily Support to Elderly Persons |
|
| Bolli, Roberto | MIT |
| Bonato, Paolo | Harvard Medical School |
| Asada, Harry | MIT |
Keywords: Physically Assistive Devices, Human-Robot Collaboration, Domestic Robotics
Abstract: Age-related loss of mobility and an increased risk of falling remain major obstacles for older adults to live independently. Many elderly people lack the coordination and strength necessary to perform activities of daily living, such as getting out of bed or stepping into a bathtub. A traditional solution is to install grab bars around the home. For assisting in bathtub transitions, grab bars are fixed to a bathroom wall. However, they are often too far away to reach and stably support the user; the installation locations of grab bars are constrained by the room layout and are often suboptimal. In this paper, we present a mobile robot that provides an older adult with a handlebar located anywhere in space - "Handle Anywhere". The robot consists of an omnidirectional mobile base attached to a repositionable handlebar. We further develop a methodology to optimally place the handle to provide the maximum support for the elderly user while performing common postural changes. A cost function with a trade-off between mechanical advantage and manipulability of the user's arm was optimized in terms of the location of the handlebar relative to the user. The methodology requires only a sagittal plane video of the elderly user performing the postural change, and thus is rapid, scalable, and uniquely customizable to each user. A proof-of-concept prototype was built, and the optimization algorithm for handle location was validated experimentally.
|
| |
| 08:42-08:48, Paper MoAT2.3 | Add to My Program |
| A Hybrid FNS Generator for Human Trunk Posture Control with Incomplete Knowledge of Neuromusculoskeletal Dynamics |
|
| Bao, Xuefeng | Case Western Reserve University |
| Friederich, Aidan | Case Western Reserve University |
| Triolo, Ronald | Case Western Reserve University |
| Audu, Musa. L. | Case Western Reserve University |
Keywords: Rehabilitation Robotics, Modeling and Simulating Humans, Motion Control
Abstract: The trunk movements of an individual paralyzed by spinal cord injury (SCI) can be restored by Functional Neuromuscular Stimulation (FNS), a technique that applies low-level current to motor nerves to activate the muscles generating torques and, thus, produce trunk motions. FNS can be modulated to control trunk movements. However, a stabilizing modulation policy (i.e., control law) is difficult to derive due to the complexity of the neuromusculoskeletal dynamics, which consist of skeletal dynamics (i.e., multi-joint rigid body dynamics) and neuromuscular dynamics (i.e., highly nonlinear, nonautonomous, and input-redundant dynamics). Therefore, an FNS-based control method that can stabilize the trunk without knowing the accurate skeletal and neuromuscular dynamics is desired. This work proposes an FNS generator, which consists of a robust nonlinear controller (RNC) that provides a stabilizing torque command and an artificial neural network (ANN)-based torque-to-activation (T-A) map to ensure that the muscles deliver the stabilizing torque to the skeleton. Due to the robustness and learning capability of this control framework, full knowledge of the trunk neuromusculoskeletal dynamics is not required. The proposed control framework has been tested in a simulation environment where an anatomically realistic 3D musculoskeletal model of the human trunk was manipulated to follow a time-varying reference that moves in the anterior-posterior and medial-lateral directions. The results show that the trunk motion converges to a satisfactory trajectory while the ANN is being updated. The results suggest the potential of this control framework for trunk tracking tasks in clinical applications.
|
| |
| 08:48-08:54, Paper MoAT2.4 | Add to My Program |
| Insole-Type Walking Assist Device Capable of Inducing Inversion-Eversion of the Ankle Angle to the Neutral Position |
|
| Itami, Taku | Aoyama Gakuin University |
| Date, Kazuki | Aoyama Gakuin University |
| Ishii, Yuuta | Aoyama Gakuin University |
| Yoneyama, Jun | Aoyama Gakuin University |
| Aoki, Takaaki | Gifu University |
Keywords: Prosthetics and Exoskeletons, Robotics and Automation in Life Sciences, Body Balancing
Abstract: In recent years, the aging of society has become a serious problem, especially in developed countries. Walking is an important element in extending healthy life expectancy in old age. In particular, inducing proper ankle joint alignment at heel contact is important during the gait cycle from the perspective of smooth weight transfer and reduction of the burden on the knees and hips. In this study, we focus on the behavior of the ankle joint at heel contact and propose an insole-type assist device that can induce inversion/eversion rotation of the ankle angle. The heel part of the proposed device tilts from left to right in response to the rotation of a stepping motor, and an inertial sensor mounted inside controls the heel part so that it always maintains a horizontal position. The effectiveness of the proposed device is verified by evaluating the amount of lateral thrust of the knee joint of six healthy male subjects during a foot-stepping motion using a motion capture system. The results showed that the amount of lateral thrust is significantly reduced when wearing the device with control.
|
| |
| 08:54-09:00, Paper MoAT2.5 | Add to My Program |
| Design for Hip Abduction Assistive Device Based on Relationship between Hip Joint Motion and Torque During Running |
|
| Lee, Myunghyun | Agency for Defense Development |
| Hong, Man Bok | Agency for Defense Development |
| Kim, Gwang Tae | Agency for Defense Development |
| Kim, Seonwoo | Agency for Defense Development |
Keywords: Physically Assistive Devices, Human Performance Augmentation, Mechanism Design
Abstract: Numerous attempts have been made to reduce metabolic energy while running with the help of assistive devices. A majority of studies on assistive devices have focused on assisting torque in the sagittal plane. In the case of running, however, the abduction torque in the frontal plane at the hip joint is greater than the flexion/extension torque in the sagittal plane. During running, the abduction torque and the motion of the hip joint have a linear relationship but are opposite in direction, as in an elastic body. It is therefore expected that the hip abduction torque can be assisted with a simple passive method by using an elastic body that reflects the movement characteristics of the hip joint. In this study, a system to assist hip abduction torque using a leaf spring was proposed and tested with a prototype. While running with the proposed assist system, the leaf spring aids the abduction torque during the stance phase, and no torque is generated during the swing phase due to the passive revolute joint. The joint angle changes with respect to the rotation in the flexion/extension direction to prevent uncomfortable torque during the swing phase and to increase the duration of the torque action during the stance phase. A preliminary test was conducted on one subject using the prototype of the hip abduction torque assistive device. The participant with the assistive device reduced metabolic energy by 5% compared to the case without abduction torque assist while running at 2.5 m/s. To increase the amount of metabolic reduction, the device should be improved through system mass reduction and hip joint position optimization.
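The passive assist behavior described, a spring torque proportional and opposite to the abduction angle during stance and zero during swing, reduces to a one-line model. The stiffness value is illustrative only; the paper's leaf spring is not characterized in the abstract.

```python
def passive_abduction_torque(theta, stance, k=40.0):
    """Leaf-spring assist torque in N*m for abduction angle theta in
    rad: linear and opposite in direction during stance, zero during
    swing because the passive revolute joint disengages."""
    return -k * theta if stance else 0.0
```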
|
| |
| 09:00-09:06, Paper MoAT2.6 | Add to My Program |
| Dynamic Hand Proprioception Via a Wearable Glove with Fabric Sensors |
|
| Behnke, Lily | Yale University |
| Sanchez-Botero, Lina | Yale University |
| Johnson, William | Yale University |
| Agrawala, Anjali | Yale University |
| Kramer-Bottiglio, Rebecca | Yale University |
Keywords: Wearable Robotics, Soft Sensors and Actuators, Soft Robot Materials and Design
Abstract: Continuous enhancement in wearable technologies has led to several innovations in the healthcare, virtual reality, and robotics sectors. One form of wearable technology is wearable sensors for kinematic measurements of human motion. However, measuring the kinematics of human movement is a challenging problem as wearable sensors need to conform to complex curvatures and deform without limiting the user's natural range of motion. In fine motor activities, such challenges are further exacerbated by the dense packing of several joints, coupled joint motions, and relatively small deformations. This work presents the design, fabrication, and characterization of a thin, breathable sensing glove capable of reconstructing fine motor kinematics. The fabric glove features capacitive sensors made from layers of conductive and dielectric fabrics, culminating in a non-bulky and discreet glove design. This study demonstrates that the glove can reconstruct the joint angles of the wearer with a root mean square error of 7.2 degrees, indicating promising applicability to dynamic pose reconstruction for wearable technology and robot teleoperation.
|
| |
| 09:06-09:12, Paper MoAT2.7 | Add to My Program |
| A Wearable Robotic Rehabilitation System for Neuro-Rehabilitation Aimed at Enhancing Mediolateral Balance |
|
| Yu, Zhenyuan | North Carolina State University |
| Nalam, Varun | North Carolina State University |
| Alili, Abbas | NC State University |
| Huang, He (Helen) | North Carolina State University and University of North Carolina |
Keywords: Rehabilitation Robotics, Prosthetics and Exoskeletons, Physical Human-Robot Interaction
Abstract: There is increasing evidence of the role of compromised mediolateral balance in falls and the need for rehabilitation specifically focused on the mediolateral direction for various populations with motor deficits. To address this need, we have developed a neurorehabilitation platform by integrating a wearable robotic hip abduction-adduction exoskeleton with a visual interface. The platform is expected to influence and rehabilitate the underlying visuomotor mechanisms in individuals by having users perform motion tasks based on visual feedback while the robot applies various controlled resistances governed by the admittance controller implemented in the robot. A preliminary study was performed on three non-disabled individuals to analyze the performance of the system and observe any adaptation in hip joint kinematics and kinetics as a result of the visuomotor training under four different admittance conditions. All three subjects exhibited increased consistency of motion during training and interlimb coordination to achieve motion tasks, demonstrating the utility of the system. Further analyses of the observed human-robot torque interactions and electromyography (EMG) signals, and their implications for neurorehabilitation aimed at populations suffering from chronic stroke, are discussed.
|
| |
| 09:12-09:18, Paper MoAT2.8 | Add to My Program |
| Analysis of Lower Extremity Shape Characteristics in Various Walking Situations for the Development of Wearable Robot |
|
| Park, Joohyun | KAIST, KIST |
| Choi, Ho Seon | Yonsei University |
| In, HyunKi | Korea Institute of Science and Technology |
Keywords: Datasets for Human Motion, Wearable Robotics, Physical Human-Robot Interaction
Abstract: A strap is a frequently utilized component for securing wearable robots to their users in order to facilitate force transmission between humans and the devices. For the wearable robot to function properly, the pressure between the strap and the skin should be maintained at an appropriate level. Due to muscle contraction, the cross-sectional area of the human limb changes with the movement of the muscle, which in turn changes the pressure applied by the strap. Therefore, to design a new strap that resolves this, it is necessary to understand the shape change characteristics of the muscle where the strap is applied. In this paper, the change in the circumference of the thigh and the calf during walking was measured and analyzed with multiple string pot sensors. Using a treadmill and string pot sensors built from potentiometers and torsion springs, leg circumference changes were measured for different walking speeds and slopes, and gait cycles were segmented according to a signal from an FSR sensor inserted in the right shoe. The experimental results showed circumference changes of about 8.5 mm and 3 mm for the thigh and the calf, respectively, with consistent tendencies across walking conditions such as speed and slope. These results confirm that such measurements can be used in algorithms for estimating gait cycles or walking conditions.
|
| |
| 09:18-09:24, Paper MoAT2.9 | Add to My Program |
| Finding Biomechanically Safe Trajectories for Robot Manipulation of the Human Body in a Search and Rescue Scenario |
|
| Peiros, Lizzie | University of California, San Diego |
| Chiu, Zih-Yun | University of California, San Diego |
| Zhi, Yuheng | University of California, San Diego |
| Shinde, Nikhil | University of California San Diego |
| Yip, Michael C. | University of California, San Diego |
Keywords: Physical Human-Robot Interaction, Modeling and Simulating Humans, Dynamics
Abstract: There has been increasing awareness of the difficulties in reaching and extracting people from mass casualty scenarios, such as those arising from natural disasters. While platforms have been designed to consider reaching casualties and even carrying them out of harm's way, the challenge of physically repositioning a casualty from its found configuration to one suitable for extraction has not been explicitly explored. Furthermore, this type of planning problem needs to incorporate biomechanical safety considerations for the casualty. Thus, we present the problem formulation for biomechanically safe trajectory generation for repositioning limbs of unconscious human casualties. We describe biomechanical safety in robotics terms, provide mechanical descriptions of the dynamics of the robot-human coupled system, and describe the planning and trajectory optimization process that considers this coupled and constrained system. We finally evaluate the work over several variations of the problem and provide a live example. This work provides a crucial part of search and rescue that can be used in conjunction with past and present works involving robots and vision systems designed for search and rescue.
|
| |
| 09:24-09:30, Paper MoAT2.10 | Add to My Program |
| Mechanical Characterisation of Woven Pneumatic Active Textile |
|
| Marshall, Ruby | The University of Edinburgh |
| Souppez, Jean-Baptiste | Aston University |
| Khan, Mariya | Aston University |
| Viola, Ignazio Maria | University of Edinburgh |
| Nabae, Hiroyuki | Tokyo Institute of Technology |
| Suzumori, Koichi | Tokyo Institute of Technology |
| Stokes, Adam Andrew | University of Edinburgh |
| Giorgio-Serchi, Francesco | University of Edinburgh |
Keywords: Wearable Robotics, Soft Robot Materials and Design, Hydraulic/Pneumatic Actuators
Abstract: Active textiles have shown promising applications in soft robotics owing to their tunable stiffness and design flexibility. Given the breadth of the design space for planar and spatial arrangements of these woven structures, a rigorous and generalizable characterisation of these systems is not yet available. In order to characterize the response of a stereotypical woven pattern to actuation, we undertake a parametric study of plain weave active fabrics and characterise their mechanical properties in accordance with the relevant ISO standards for varying muscle densities and both monotonically increasing/decreasing pressures. Tensile and flexural tests were undertaken on five plain weave samples made of a nylon 6 (polyamide) warp and EM20 McKibben S-muscle weft, for input pressures ranging from 0.00 MPa to 0.60 MPa, at three muscle densities, namely 100 m^-1, 74.26 m^-1 and 47.62 m^-1. Contrary to intuition, we find that a lower muscle density has a more prominent impact on the thickness, but a significantly lesser one on length, highlighting a critical dependency on the relative orientation among the loading, the passive textile and the muscle filaments. Hysteretic behaviour as large as 10% of the longitudinal contraction is observed on individual filaments and woven textiles, and its onset is identified in the shear between the rubber tube and the outer sleeve of the artificial muscle. Hysteresis is shown to be muscle density-dependent and responsible for a strongly asymmetrical response upon different pressure inputs. These findings provide new insights into the mechanical properties of active textiles with tunable stiffness, and may contribute to future developments in wearable technologies and biomedical devices.
|
| |
| 09:30-09:36, Paper MoAT2.11 | Add to My Program |
| Adaptive Symmetry Reference Trajectory Generation in Shared Autonomy for Active Knee Orthosis |
|
| Liu, Rongkai | University of Science and Technology of China (USTC) |
| Ma, Tingting | Chinese Academy of Sciences |
| Yao, Ningguang | University of Science and Technology of China |
| Li, Hao | Chinese Academy of Sciences |
| Zhao, Xinyan | University of Science and Technology of China |
| Wang, Yu | University of Science and Technology of China |
| Pan, Hongqing | Hefei Institutes of Physical Science |
| Song, Quanjun | Chinese Academy of Science |
Keywords: Human-Centered Robotics, Rehabilitation Robotics, Human-Robot Collaboration
Abstract: Gait symmetry training plays an essential role in the rehabilitation of hemiplegic patients, and robotics-based gait training has been widely accepted by patients and clinicians. Reference trajectory generation for the affected side using the motion data of the unaffected side is an important way to achieve this. However, online generation of the gait reference trajectory requires the algorithm to provide the correct gait phase delay while reducing the impact of measurement noise from sensors and input uncertainty from users. Based on an active knee orthosis (AKO) prototype, this work presents an adaptive symmetric gait trajectory generation framework for the gait rehabilitation of hemiplegic patients. Using adaptive nonlinear frequency oscillators (ANFO) and movement primitives, we implement online gait pattern encoding and adaptive phase delay according to the real-time user input. A shared autonomy (SA) module with online input validation and arbitration has been designed to prevent undesired movements from being transmitted to the actuator on the affected side. The experimental results demonstrate the feasibility of the framework. Overall, this work suggests that the proposed method has the potential to perform gait symmetry rehabilitation in an unstructured environment and provide a kinematic reference for torque-assist AKO.
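The adaptive frequency oscillators underlying such frameworks can be illustrated with a minimal phase-locking sketch. The paper's exact ANFO equations are not reproduced here; the dynamics, gains, and names below are our own simplification:

```python
import math

# Minimal phase-locked-loop-style sketch of an adaptive frequency
# oscillator: the internal phase rate locks onto a periodic input of
# unknown frequency omega_in, so the learned frequency can later replay
# the gait pattern with the correct phase delay. Gains, dt, and names
# are illustrative, not the paper's values.

def adapt_to_input(omega_in=3.0, omega0=2.0, K=5.0, dt=0.001, T=60.0):
    theta, omega = 0.0, omega0
    steps = int(T / dt)
    for i in range(steps):
        e = omega_in * (i * dt) - theta          # phase error w.r.t. input
        theta += dt * (omega + K * math.sin(e))  # fast phase correction
        omega += dt * (K * math.sin(e))          # slow frequency adaptation
    return omega
```

Starting from omega0 = 2.0 rad/s, the oscillator's frequency converges to the input's 3.0 rad/s, which is the behaviour that lets the unaffected side's gait pattern be encoded online.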
|
| |
| 09:36-09:42, Paper MoAT2.12 | Add to My Program |
| Data-Driven Modeling for Gait Phase Recognition in a Wearable Exoskeleton Using Estimated Forces (I) |
|
| Park, Kyeong-Won | Republic of Korea Air Force Academy |
| Choi, Jungsu | Yeungnam University |
| Kong, Kyoungchul | Korea Advanced Institute of Science and Technology |
Keywords: Wearable Robots, AI-Based Methods, Human-Centered Robotics, Robust/Adaptive Control of Robotic Systems
Abstract: Accurate identification of gait phases is critical in effectively assessing the assistance provided by lower-limb exoskeletons. In this study, we propose a novel gait phase recognition system called ObsNet to analyze the gait of individuals with spinal cord injuries (SCI). To ensure the reliable use of exoskeletons, it is essential to maintain practicality and avoid exposing the system to unnecessary risks of fatigue, inaccuracy, or incompatibility with human-centered devices. Therefore, we propose a new approach to characterize exoskeletal-assisted gait by estimating forces on exoskeletal joints during walking. Although these estimated forces are potentially useful for detecting gait phases, their nonlinearities make it challenging for existing algorithms to generalize accurately. To address this challenge, we introduce a data-driven model that simultaneously captures both feature extraction and order dependencies, and enhance its performance through a threshold-based compensational method to filter out momentary errors. We evaluated the effectiveness of ObsNet through robotic walking experiments with two practical users with complete paraplegia. Our results indicate that ObsNet outperformed state-of-the-art methods that use joint information and other recurrent networks in identifying the gait phases of individuals with SCI (p < 0.05). We also observed reliable imitation of ground truth after compensation. Overall, our research highlights the potential of wearable technology to improve the daily lives of individuals with disabilities through accurate and stable state assessment.
|
| |
| MoAT3 Regular session, 140C |
Add to My Program |
| Collision Avoidance I |
|
| |
| Chair: Panagou, Dimitra | University of Michigan, Ann Arbor |
| Co-Chair: Pierson, Alyssa | Boston University |
| |
| 08:30-08:36, Paper MoAT3.1 | Add to My Program |
| Dynamic Multi-Query Motion Planning with Differential Constraints and Moving Goals |
|
| Gentner, Michael | Technical University of Munich and BMW AG |
| Zillenbiller, Fabian | Technical University of Munich and BMW AG |
| Kraft, André | BMW AG, Germany |
| Steinbach, Eckehard | Technical University of Munich |
Keywords: Collision Avoidance, Motion and Path Planning, Industrial Robots
Abstract: Planning robot motions in complex environments is a fundamental research challenge and central to the autonomy, efficiency, and ultimately adoption of robots. While the environment is often assumed to be static, real-world settings, such as assembly lines, contain complex-shaped moving obstacles and changing target states. Therein, robots must perform safe and efficient motions to achieve their tasks. In repetitive environments and multi-goal settings, reusable roadmaps can substantially reduce the overall query time. Most dynamic roadmap-based planners operate in state-time-space, which is computationally demanding. Interval-based methods store availabilities as node attributes and thereby circumvent the dimensionality increase. However, current approaches do not consider higher-order constraints, which can ultimately lead to collisions during execution. Furthermore, current approaches must replan when the goal changes. To this end, we propose a novel roadmap-based planner for systems with third-order differential constraints operating in dynamic environments with moving goals. We construct a roadmap with availabilities as node attributes. During the query phase, we use a Double-Integrator Minimum Time (DIMT) solver to recursively build feasible trajectories and accurately estimate arrival times. An exit node set in combination with a moving goal heuristic is used to efficiently find the fastest path through the roadmap to the moving goal. We evaluate our method with a simulated UAV operating in dynamic 2D environments and show that it also transfers to a 6-DoF manipulator. We show higher success rates than other state-of-the-art methods both in collision avoidance and reaching a moving goal.
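The 1-D, rest-to-rest special case of the Double-Integrator Minimum Time subproblem admits a closed form, sketched below. This illustrates the underlying bang-bang principle only; it is not the paper's full DIMT solver, and the function name is ours:

```python
import math

# 1-D, rest-to-rest special case of the DIMT subproblem: under |a| <= a_max
# and |v| <= v_max, the time-optimal profile is bang-bang, with a cruise
# phase inserted when the velocity limit binds.

def min_time_1d(d: float, a_max: float, v_max: float) -> float:
    d = abs(d)
    v_peak = math.sqrt(d * a_max)            # peak velocity of pure bang-bang
    if v_peak <= v_max:
        return 2.0 * math.sqrt(d / a_max)    # accelerate, then decelerate
    return d / v_max + v_max / a_max         # accelerate, cruise, decelerate
```

For example, covering 4 m with a_max = 1 m/s^2 takes 4 s unconstrained, but 5 s when v_max = 1 m/s forces a cruise phase.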
|
| |
| 08:36-08:42, Paper MoAT3.2 | Add to My Program |
| Reactive and Safe Co-Navigation with Haptic Guidance |
|
| Coffey, Mela | Boston University |
| Zhang, Dawei | Boston University |
| Tron, Roberto | Boston University |
| Pierson, Alyssa | Boston University |
Keywords: Collision Avoidance, Telerobotics and Teleoperation, Human-Robot Collaboration
Abstract: We propose a co-navigation algorithm that enables a human and a robot to work together to navigate to a common goal. In this system, the human is responsible for making high-level steering decisions, and the robot, in turn, provides haptic feedback for collision avoidance and path suggestions while reacting to changes in the environment. Our algorithm uses optimized Rapidly-exploring Random Trees (RRT*) to generate paths to lead the user to the goal, via an attractive force feedback computed using a Control Lyapunov Function (CLF). We simultaneously ensure collision avoidance where necessary using a Control Barrier Function (CBF). We demonstrate our approach using simulations with a virtual pilot, and hardware experiments with a human pilot. Our results show that combining RRT* and CBFs is a promising tool for enabling collaborative human-robot navigation.
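The combination of an attractive CLF-style command with a CBF safety filter admits a compact closed-form step for a single circular obstacle. The gains, geometry, and function names below are illustrative assumptions, not the paper's implementation:

```python
# Sketch of CLF attraction plus a CBF safety filter: the attractive command
# pulls toward the next RRT* waypoint; the barrier h(x) = |x - obs|^2 - r^2
# must satisfy grad_h . u + alpha*h >= 0, enforced here by the analytic
# solution of the one-constraint CBF quadratic program (a minimal
# projection of the nominal command onto the safe half-space).

def safe_command(x, waypoint, obs, r, k=1.0, alpha=2.0):
    u = [k * (waypoint[i] - x[i]) for i in range(2)]   # CLF-style attraction
    dx = [x[i] - obs[i] for i in range(2)]
    h = dx[0] ** 2 + dx[1] ** 2 - r * r                # barrier value
    g = [2.0 * dx[0], 2.0 * dx[1]]                     # gradient of h
    slack = g[0] * u[0] + g[1] * u[1] + alpha * h
    if slack >= 0.0:
        return u                        # nominal command is already safe
    gg = g[0] ** 2 + g[1] ** 2
    return [u[i] - slack * g[i] / gg for i in range(2)]  # minimal correction
```

Heading straight at an obstacle, the filter slows the commanded velocity just enough to keep the barrier condition satisfied; heading away, the nominal command passes through unchanged.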
|
| |
| 08:42-08:48, Paper MoAT3.3 | Add to My Program |
| An MCTS-DRL Based Obstacle and Occlusion Avoidance Methodology in Robotic Follow-Ahead Applications |
|
| Leisiazar, Sahar | Simon Fraser University |
| Park, Edward J. | Simon Fraser University |
| Lim, Angelica | Simon Fraser University |
| Chen, Mo | Simon Fraser University |
Keywords: Robot Companions, Collision Avoidance, AI-Enabled Robotics
Abstract: We propose a novel methodology for robotic follow-ahead applications that addresses the critical challenge of obstacle and occlusion avoidance. Our approach effectively navigates the robot while ensuring avoidance of collisions and occlusions caused by surrounding objects. To achieve this, we developed a high-level decision-making algorithm that generates short-term navigational goals for the mobile robot. Monte Carlo Tree Search is integrated with a Deep Reinforcement Learning method to enhance the performance of the decision-making process and generate more reliable navigational goals. Through extensive experimentation and analysis, we demonstrate the effectiveness and superiority of our proposed approach in comparison to the existing follow-ahead human-following robotic methods. Our code is available at https://github.com/saharLeisiazar/follow-ahead-ros.
|
| |
| 08:48-08:54, Paper MoAT3.4 | Add to My Program |
| Proactive Model Predictive Control with Multi-Modal Human Motion Prediction in Cluttered Dynamic Environments |
|
| Heuer, Lukas | Örebro University, Robert Bosch GmbH |
| Palmieri, Luigi | Robert Bosch GmbH |
| Rudenko, Andrey | Robert Bosch GmbH |
| Mannucci, Anna | Robert Bosch GmbH Corporate Research |
| Magnusson, Martin | Örebro University |
| Arras, Kai Oliver | Bosch Research |
Keywords: Collision Avoidance, Human-Aware Motion Planning, Motion and Path Planning
Abstract: For robots navigating in dynamic environments, exploiting and understanding uncertain human motion prediction is key to generate efficient, safe and legible actions. The robot may perform poorly and cause hindrances if it does not reason over possible, multi-modal future social interactions. With the goal of further enhancing autonomous navigation in cluttered environments, we propose a novel formulation for nonlinear model predictive control including multi-modal predictions of human motion. As a result, our approach leads to less conservative, smooth and intuitive human-aware navigation with reduced risk of collisions, and shows a good balance between task efficiency, collision avoidance and human comfort. To show its effectiveness, we compare our approach against the state of the art in crowded simulated environments, and with real-world human motion data from the THOR dataset. This comparison shows that we are able to improve task efficiency, keep a larger distance to humans and significantly reduce the collision time, when navigating in cluttered dynamic environments. Furthermore, the method is shown to work robustly with different state-of-the-art human motion predictors.
|
| |
| 08:54-09:00, Paper MoAT3.5 | Add to My Program |
| A Novel Obstacle-Avoidance Solution with Non-Iterative Neural Controller for Joint-Constrained Redundant Manipulators |
|
| Li, Weibing | Sun Yat-Sen University |
| Yi, Zilian | Sun Yat-Sen University |
| Zou, Yanying | Sun Yat-Sen University |
| Wu, Haimei | Sun Yat-Sen University |
| Yang, Yang | Sun Yat-Sen University |
| Pan, Yongping | Sun Yat-Sen University |
Keywords: Collision Avoidance, Optimization and Optimal Control, Redundant Robots
Abstract: Obstacle avoidance (OA) and joint-limit avoidance (JLA) are essential for redundant manipulators to ensure safe and reliable robotic operations. One solution is to incorporate the involved constraints into a quadratic program (QP), the solution of which achieves OA and JLA. A few non-iterative solvers exist, such as zeroing neural networks (ZNNs), which can solve each sampled QP problem in a single iteration, yet none is suitable for OA and JLA due to the absence of some derivative information. To tackle these issues, this paper proposes a novel solution with a non-iterative neural controller, termed NCP-ZNN, for joint-constrained redundant manipulators. Unlike iterative methods, the proposed neural controller incorporates derivative information and possesses positive features including non-iterative computing and convergence over time. In this paper, the reestablished OA-JLA scheme is first introduced. Then, the design details of the neural controller are presented. After that, comparative simulations on a PA10 robot and an experiment on a Franka Emika Panda robot are conducted, demonstrating that the proposed neural controller is more competent in OA and JLA.
|
| |
| 09:00-09:06, Paper MoAT3.6 | Add to My Program |
| TTC4MCP: Monocular Collision Prediction Based on Self-Supervised TTC Estimation |
|
| Li, Changlin | Shanghai Jiao Tong University |
| Qian, Yeqiang | Shanghai Jiao Tong University |
| Sun, Cong | Shanghai Jiao Tong University |
| Yan, Weihao | Shanghai Jiao Tong University |
| Wang, Chunxiang | Shanghai Jiaotong University |
| Yang, Ming | Shanghai Jiao Tong University |
Keywords: Collision Avoidance, Computer Vision for Transportation, Deep Learning for Visual Perception
Abstract: Vision-based collision prediction for autonomous driving is a challenging task due to the dynamic movement of vehicles and diverse types of obstacles. Most existing methods rely on object detection algorithms, which only predict predefined collision targets, such as vehicles and pedestrians, and cannot anticipate emergencies caused by unknown obstacles. To address this limitation, we propose a novel approach using pixel-wise time-to-collision (TTC) estimation for monocular collision prediction (TTC4MCP). Our approach predicts TTC and optical flow from monocular images and identifies potential collision areas using feature clustering and motion analysis. To overcome the challenge of training TTC estimation models without ground truth data in new scenes, we propose a self-supervised TTC training method, enabling collision prediction in a wider range of scenarios. TTC4MCP is evaluated on multiple road conditions and demonstrates promising results in terms of accuracy and robustness.
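A full self-supervised TTC pipeline is beyond a snippet, but the geometric relation underlying TTC estimation is compact: under constant closing speed, time to collision equals the apparent size of an object divided by its rate of expansion in the image. A minimal sketch (the function name and setup are ours, not the paper's):

```python
# Geometric core of time-to-collision estimation: for a constant closing
# speed, TTC equals apparent object size divided by its rate of expansion.

def ttc_from_widths(w0: float, w1: float, dt: float) -> float:
    """TTC in seconds (at the second frame) from two apparent widths
    observed dt seconds apart; w1 > w0 for an approaching object."""
    return dt * w0 / (w1 - w0)

# Synthetic check: depth 10 m closing at 2 m/s, frames 0.1 s apart, and
# apparent width proportional to 1/depth. The estimate should match the
# true TTC at the second frame, 9.8 m / (2 m/s) = 4.9 s.
ttc = ttc_from_widths(1.0 / 10.0, 1.0 / 9.8, 0.1)
```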
|
| |
| 09:06-09:12, Paper MoAT3.7 | Add to My Program |
| DAMON: Dynamic Amorphous Obstacle Navigation Using Topological Manifold Learning and Variational Autoencoding |
|
| Dastider, Apan | University of Central Florida |
| Mingjie, Lin | University of Central Florida |
Keywords: Collision Avoidance, Deep Learning Methods, Motion and Path Planning
Abstract: DAMON leverages manifold learning and variational autoencoding to achieve obstacle avoidance, allowing for motion planning through adaptive graph traversal in a pre-learned low-dimensional hierarchically-structured manifold graph that captures intricate motion dynamics between a robotic arm and its obstacles. This versatile and reusable approach is applicable to various collaboration scenarios. The primary advantage of DAMON is its ability to embed information in a low-dimensional graph, eliminating the need for repeated computation required by current sampling-based methods. As a result, it offers faster and more efficient motion planning with significantly lower computational overhead and memory footprint. In summary, DAMON is a breakthrough methodology that addresses the challenge of dynamic obstacle avoidance in robotic systems and offers a promising solution for safe and efficient human-robot collaboration. Our approach has been experimentally validated on a 7-DoF robotic manipulator in both simulation and physical settings. DAMON enables the robot to learn and generate skills for avoiding previously-unseen obstacles while achieving predefined objectives. We also optimize DAMON's design parameters and performance using an analytical framework. Our approach outperforms mainstream methodologies, including RRT, RRT*, Dynamic RRT*, L2RRT, and MpNet, with 40% more trajectory smoothness and over 65% improved latency performance, on average.
|
| |
| 09:12-09:18, Paper MoAT3.8 | Add to My Program |
| Gatekeeper: Online Safety Verification and Control for Nonlinear Systems in Dynamic Environments |
|
| Agrawal, Devansh | University of Michigan |
| Chen, Ruichang | University of Michigan |
| Panagou, Dimitra | University of Michigan, Ann Arbor |
Keywords: Collision Avoidance, Motion and Path Planning
Abstract: This paper presents the gatekeeper algorithm, a real-time and computationally-lightweight method to ensure that nonlinear systems can operate safely in dynamic environments despite limited perception. Gatekeeper integrates with existing path planners and feedback controllers by introducing an additional verification step that ensures that proposed trajectories can be executed safely, despite nonlinear dynamics subject to bounded disturbances, input constraints and partial knowledge of the environment. Our key contribution is that (A) we propose an algorithm to recursively construct committed trajectories, and (B) we prove that tracking the committed trajectory ensures the system is safe for all time into the future. The method is demonstrated on a complicated firefighting mission in a dynamic environment and compared against state-of-the-art techniques for similar problems.
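The verification step can be caricatured in a few lines: commit a candidate trajectory only if it is safe over the currently known region and ends in a state from which a verified backup keeps the system safe; otherwise keep the previously committed trajectory. The 1-D toy below is our own illustration, with our own names and safety predicate, not the paper's algorithm:

```python
# Toy of the gatekeeper verification step in 1-D: a candidate trajectory is
# committed only if every state on it is safe AND its final state admits a
# verified backup (here: stopping at that state is safe forever). Otherwise
# the previously committed, already-verified trajectory is kept.

def verify_and_commit(candidate, committed, is_safe, backup_ok):
    if all(is_safe(x) for x in candidate) and backup_ok(candidate[-1]):
        return candidate        # becomes the new committed trajectory
    return committed            # keep the last verified trajectory

obstacle = 5.0
is_safe = lambda x: x < obstacle - 0.5   # keep a 0.5 m margin to the obstacle
backup_ok = is_safe                      # once stopped at x, we stay safe

committed = [0.0, 0.5, 1.0]
committed = verify_and_commit([0.0, 1.0, 2.0, 3.0], committed, is_safe, backup_ok)
attempt = verify_and_commit([3.0, 4.0, 5.0], committed, is_safe, backup_ok)
```

The first candidate is safe end-to-end and is committed; the second reaches the obstacle, so the verifier rejects it and the system keeps tracking the earlier committed trajectory.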
|
| |
| 09:18-09:24, Paper MoAT3.9 | Add to My Program |
| Combinatorial Disjunctive Constraints for Obstacle Avoidance in Path Planning |
|
| Garcia, Raul | Rice University |
| Hicks, Illya V. | Rice University |
| Huchette, Joey | Google Research |
Keywords: Collision Avoidance, Motion and Path Planning, Optimization and Optimal Control
Abstract: We present a new approach for modeling avoidance constraints in 2D environments, in which waypoints are assigned to obstacle-free polyhedral regions. Constraints of this form are often formulated as mixed-integer programming (MIP) problems employing big-M techniques - however, these are generally not the strongest formulations possible with respect to the MIP's convex relaxation (so called ideal formulations), potentially resulting in larger computational burden. We instead model obstacle avoidance as combinatorial disjunctive constraints and leverage the independent branching scheme to construct small, ideal formulations. As our approach requires a biclique cover for an associated graph, we exploit the structure of this class of graphs to develop a fast subroutine for obtaining biclique covers in polynomial time. We also contribute an open-source Julia library named ClutteredEnvPathOpt to facilitate computational experiments of MIP formulations for obstacle avoidance. Experiments have shown our formulation is more compact and remains competitive on a number of instances compared with standard big-M techniques, for which solvers possess highly optimized procedures.
|
| |
| 09:24-09:30, Paper MoAT3.10 | Add to My Program |
| Reachability-Aware Collision Avoidance for Tractor-Trailer System with Non-Linear MPC and Control Barrier Function |
|
| Tang, Yucheng | University of Applied Sciences Karlsruhe |
| Mamaev, Ilshat | Karlsruhe Institute of Technology |
| Qin, Jing | Karlsruhe University of Applied Sciences |
| Wurll, Christian | Karlsruhe University of Applied Sciences |
| Hein, Björn | Karlsruhe University of Applied Sciences |
Keywords: Collision Avoidance, Optimization and Optimal Control, Nonholonomic Motion Planning
Abstract: This paper proposes a reachability-aware model predictive control with a discrete control barrier function for backward obstacle avoidance for a tractor-trailer system. The framework incorporates the state-variant reachable set obtained through sampling-based reachability analysis and symbolic regression into the objective function of model predictive control. By optimizing the intersection of the reachable set and iterative non-safe region generated by the control barrier function, the system demonstrates better performance in terms of safety with a constant decay rate, while enhancing the feasibility of the optimization problem. The proposed algorithm improves real-time performance due to a shorter horizon and outperforms the state-of-the-art algorithms in the simulation environment and on a real robot.
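The discrete control barrier function constraint commonly used in such MPC schemes requires h(x_{k+1}) >= (1 - gamma) h(x_k) for a decay rate gamma in (0, 1], so the state may approach but never cross the boundary h = 0 of the safe set. A minimal check of a candidate step (the naming is ours, not the paper's):

```python
# Discrete-time CBF condition with a constant decay rate gamma in (0, 1]:
# the barrier value may shrink by at most a factor (1 - gamma) per step,
# so a state starting with h > 0 can approach, but never cross, h = 0.

def dcbf_step_ok(h_next: float, h_curr: float, gamma: float) -> bool:
    return h_next >= (1.0 - gamma) * h_curr
```

In the MPC, this inequality is imposed on every predicted step; here it simply classifies whether a single candidate transition respects the allowed decay.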
|
| |
| 09:30-09:36, Paper MoAT3.11 | Add to My Program |
| Continuous Implicit SDF Based Any-Shape Robot Trajectory Optimization |
|
| Zhang, Tingrui | Zhejiang University |
| Wang, Jingping | Zhejiang University |
| Xu, Chao | Zhejiang University |
| Gao, Alan | Fan'gang |
| Gao, Fei | Zhejiang University |
Keywords: Collision Avoidance, Whole-Body Motion Planning and Control, Motion and Path Planning
Abstract: Optimization-based trajectory generation methods are widely used in whole-body planning for robots. However, existing work either oversimplifies the robot's geometry and environment representation, resulting in a conservative trajectory, or suffers from a huge overhead in maintaining additional information such as the Signed Distance Field (SDF). To bridge the gap, we consider the robot as an implicit function, with its surface boundary represented by the zero-level set of its SDF. We further employ another implicit function to lazily compute the signed distance to the swept volume generated by the robot and its trajectory. The computation is efficient by exploiting continuity in space-time, and the implicit function guarantees precise and continuous collision evaluation even for nonconvex robots with complex surfaces. We also propose a trajectory optimization pipeline applicable to the implicit SDF. Simulation and real-world experiments validate the high performance of our approach for arbitrarily shaped robot trajectory optimization. The code will be released at https://github.com/ZJU-FAST-Lab/Implicit-SDF-Planner.
|
| |
| 09:36-09:42, Paper MoAT3.12 | Add to My Program |
| Robo-Centric ESDF: A Fast and Accurate Whole-Body Collision Evaluation Tool for Any-Shape Robotic Planning |
|
| Geng, Shuang | Zhejiang University |
| Wang, Qianhao | Zhejiang University |
| Xie, Lei | State Key Laboratory of Industrial Control Technology, Zhejiang |
| Xu, Chao | Zhejiang University |
| Cao, Yanjun | Zhejiang University, Huzhou Institute of Zhejiang University |
| Gao, Fei | Zhejiang University |
Keywords: Collision Avoidance, Motion and Path Planning
Abstract: To let mobile robots travel flexibly through complicated environments, increasing attention has been paid to whole-body collision evaluation. Most existing works either opt for conservative corridor-based methods that impose strict requirements on the corridor generation, or ESDF-based methods that suffer from high computational overhead. It remains a great challenge to achieve fast and accurate whole-body collision evaluation. In this paper, we propose a Robo-centric ESDF (RC-ESDF) that is pre-built in the robot body frame and can be seamlessly applied to any-shape mobile robots, even those with non-convex shapes. RC-ESDF enjoys lazy collision evaluation, which retains only the minimum information sufficient for the whole-body safety constraint and significantly speeds up trajectory optimization. Based on the analytical gradients provided by RC-ESDF, we jointly optimize the position and rotation of the robot, taking whole-body safety, smoothness, and dynamical feasibility into account. Extensive simulation and real-world experiments verified the reliability and generalizability of our method.
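The robo-centric idea, pre-building a signed distance function once in the robot body frame and evaluating world obstacle points by transforming them into that frame, can be sketched as follows. An analytic rectangle SDF stands in for the pre-built grid; shapes, sizes, and names are our own illustration:

```python
import math

# Sketch of robo-centric collision evaluation: the SDF is defined once in
# the robot body frame (here analytically, for a rectangular footprint),
# and each world obstacle point is transformed into the body frame for
# lookup. Negative clearance means the point lies inside the robot.

def body_sdf(px: float, py: float, half_l: float = 0.6, half_w: float = 0.3) -> float:
    """Signed distance from a body-frame point to a rectangular robot."""
    dx, dy = abs(px) - half_l, abs(py) - half_w
    outside = math.hypot(max(dx, 0.0), max(dy, 0.0))
    inside = min(max(dx, dy), 0.0)
    return outside + inside

def clearance(robot_xy, yaw, obstacle_xy):
    """Transform a world obstacle point into the body frame, then look up
    the pre-built SDF."""
    c, s = math.cos(yaw), math.sin(yaw)
    wx = obstacle_xy[0] - robot_xy[0]
    wy = obstacle_xy[1] - robot_xy[1]
    return body_sdf(c * wx + s * wy, -s * wx + c * wy)
```

Because the SDF lives in the body frame, the same lookup serves every robot pose along a trajectory; only the rigid transform changes, which is what makes the evaluation lazy and cheap.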
|
| |
| 09:42-09:48, Paper MoAT3.13 | Add to My Program |
| Global Map Assisted Multi-Agent Collision Avoidance Via Deep Reinforcement Learning Around Complex Obstacles |
|
| Du, Yuanyuan | The Chinese University of Hong Kong, Shenzhen |
| Zhang, Jianan | Peking University |
| Xu, Jie | The Chinese University of Hong Kong, Shenzhen |
| Cheng, Xiang | Peking University |
| Cui, Shuguang | The Chinese University of Hong Kong, Shenzhen |
Keywords: Collision Avoidance, Motion and Path Planning, Reinforcement Learning
Abstract: State-of-the-art multi-agent collision avoidance algorithms face limitations when applied to cluttered public environments, where obstacles may have a variety of shapes and structures. The issue arises because most of these algorithms are agent-level methods: they concentrate solely on preventing collisions between agents, while obstacles are handled merely out-of-policy. Obstacle-aware policies, by contrast, output an action that considers both agents and obstacles. Current obstacle-aware algorithms, mainly based on Lidar sensor data, struggle to handle collision avoidance around complex obstacles. To resolve this issue, this paper investigates how to better travel around diverse obstacles. In particular, we present a global-map-assisted collision avoidance algorithm which, guided by a high-level goal guide and an obstacle representation called the distance map, considers other agents and obstacles simultaneously. Moreover, our model can be loaded into each agent individually, making it applicable to large maps or more agents. Simulation results indicate that our model outperforms state-of-the-art algorithms, particularly in scenarios with complex obstacles. We present an approach for incorporating global information into decentralized decision-making, along with a method for extending agent-level algorithms to cluttered environments in real-world scenarios.
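The "distance map" obstacle representation the abstract relies on is a standard construct: each cell of a grid stores its distance to the nearest occupied cell. The brute-force sketch below is illustrative only (practical systems use fast distance transforms, and the function name and grid are assumptions made here):

```python
import numpy as np

def distance_map(occupancy, resolution=1.0):
    """Brute-force distance map: for every cell, the Euclidean distance to
    the nearest occupied cell. O(cells x obstacles); fine for toy grids."""
    obs = np.argwhere(occupancy)                 # occupied cell coordinates
    h, w = occupancy.shape
    dmap = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            d = np.hypot(obs[:, 0] - i, obs[:, 1] - j)
            dmap[i, j] = d.min() * resolution
    return dmap

grid = np.zeros((5, 5), dtype=bool)
grid[2, 2] = True                                # one obstacle in the middle
dm = distance_map(grid)
print(dm[2, 2])   # on the obstacle: 0.0
print(dm[0, 2])   # two cells away: 2.0
```

A representation like this gives the policy smooth, shape-agnostic obstacle information, which is what lets it generalize beyond the circular obstacles agent-level methods typically assume.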
|
| |
| MoAT4 Regular session, 140D |
Add to My Program |
| Control Applications |
|
| |
| Chair: Stuart, Hannah | UC Berkeley |
| Co-Chair: Poonawala, Hasan A. | University of Kentucky |
| |
| 08:30-08:36, Paper MoAT4.1 | Add to My Program |
| A Geometric Sufficient Condition for Contact Wrench Feasibility |
|
| Li, Shenggao | University of Notre Dame |
| Chen, Hua | Southern University of Science and Technology |
| Zhang, Wei | Southern University of Science and Technology |
| Wensing, Patrick M. | University of Notre Dame |
Keywords: Body Balancing, Humanoid and Bipedal Locomotion, Whole-Body Motion Planning and Control
Abstract: A fundamental problem in legged locomotion is to verify whether a desired trajectory satisfies all physical constraints, especially those for maintaining the contacts. Although foot tipping can be avoided via the Zero Moment Point (ZMP) condition, preventing foot sliding and twisting leads to the more complex Contact Wrench Cone (CWC) constraints. This paper proposes an efficient algorithm to certify the inclusion of a net contact wrench in the CWC on flat ground with uniform friction. In addition to checking the ZMP criterion, the proposed method also verifies whether the linear force and the yaw moment are feasible. The key step in the algorithm is a new exact geometric characterization of the yaw moment limits in the case when the support polygon is approximated by a single supporting line. We propose two approaches to select this approximating line, providing an accurate inner approximation of the ground-truth yaw moment limits with only 18.80% (resp. 7.13%) error. The methods require only 1/150 (resp. 1/139) of the computation time compared to the exact CWC method based on conic programming. As a benchmark, approximating the CWC using square friction pyramids requires computation times similar to the exact CWC, but has > 19.35% error. Unlike the ZMP condition, our method provides a sufficient condition for contact wrench feasibility.
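As background for the abstract's starting point: on flat ground, the classical ZMP tipping check reduces to computing a point from the net contact wrench and testing it against the support polygon. The paper's CWC and yaw-moment certification goes well beyond this and is not reproduced here; the sketch below covers only the textbook ZMP pre-check, with a unit-square support polygon assumed for illustration:

```python
import numpy as np

def zmp_from_wrench(f, m):
    """ZMP on flat ground (z = 0) from the net contact force f = (fx, fy, fz)
    and the moment m = (mx, my, mz) about the origin; valid when fz > 0."""
    fx, fy, fz = f
    mx, my, mz = m
    return np.array([-my / fz, mx / fz])

def inside_convex_polygon(p, verts):
    """True if p lies inside the convex polygon verts (CCW vertex order)."""
    v = np.asarray(verts, dtype=float)
    edges = np.roll(v, -1, axis=0) - v         # edge vectors, vertex to next
    rel = p - v                                 # point relative to each vertex
    cross = edges[:, 0] * rel[:, 1] - edges[:, 1] * rel[:, 0]
    return bool(np.all(cross >= 0))             # left of every CCW edge

support = [(0, 0), (1, 0), (1, 1), (0, 1)]      # unit-square support polygon
f = (0.0, 0.0, 100.0)                           # 100 N of normal force
m = (30.0, -50.0, 0.0)                          # moments about the origin
p = zmp_from_wrench(f, m)
print(p, inside_convex_polygon(p, support))     # ZMP at (0.5, 0.3): feasible
```

Passing this check rules out tipping only; the sliding and yaw-twisting limits the paper characterizes require the full friction information in the CWC.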
|
| |
| 08:36-08:42, Paper MoAT4.2 | Add to My Program |
| Aggregating Single-Wheeled Mobile Robots for Omnidirectional Movements |
|
| Wang, Meng | Beijing Institute for General Artificial Intelligence |
| Su, Yao | Beijing Institute for General Artificial Intelligence |
| Li, Hang | Beijing Institute for General Artificial Intelligence |
| Li, Jiarui | Peking University |
| Liang, Jixiang | Beihang University |
| Liu, Hangxin | Beijing Institute for General Artificial Intelligence (BIGAI) |
Keywords: Education Robotics, Art and Entertainment Robotics
Abstract: This paper presents a novel modular robot system that can self-reconfigure to achieve omnidirectional movements for collaborative object transportation. Each robotic module is equipped with a steerable omni-wheel for navigation and is shaped as a regular icositetragon with a permanent magnet installed on each corner for stable docking. After aggregating multiple modules and forming a structure that can cage a target object, we developed an optimization-based method to compute the distribution of all wheels' heading directions, enabling efficient omnidirectional movement of the structure. By implementing a hierarchical controller on our prototyped system in both simulation and experiment, we validated the trajectory-tracking performance of an individual module and a team of six modules in multiple navigation and collaborative object transportation settings. The results demonstrate that the proposed system can maintain a stable caging formation and achieve smooth transportation, indicating the effectiveness of our hardware and locomotion designs.
|
| |
| 08:42-08:48, Paper MoAT4.3 | Add to My Program |
| An On-Wall-Rotating Strategy for Effective Upstream Motion of Untethered Millirobot: Principle, Design and Demonstration (I) |
|
| Yang, Liu | City University of Hong Kong |
| Zhang, Tieshan | City University of Hong Kong |
| Huang, Han | City University of Hong Kong |
| Ren, Hao | City University of Hong Kong |
| Shang, Wanfeng | Shenzhen Institutes of Advanced Technology, Chinese Academy of S |
| Shen, Yajing | The Hong Kong University of Science and Technology |
Keywords: On-Wall-Rotating, Medical Robots and Systems, Modeling, Control, and Learning for Soft Robots, Micro/Nano Robots
Abstract: Untethered miniature robots that can access narrow and harsh environments in the body show great potential for future biomedical applications. Although many types of millirobots have been developed, swimming against fast blood flow remains a major challenge due to the robot's limited ability to hold its position and the large hydraulic resistance of blood. This work proposes an on-wall-rotating strategy and a streamlined millirobot to achieve effective upstream motion in a lumen. First, the principle of the on-wall-rotating strategy and the dynamic motion model of the millirobot are established. Then, a critical safety angle θs is theoretically and experimentally analyzed for safe and stable control of the robot. After that, a series of experiments is conducted to verify the proposed driving strategy. The results suggest that the robot is able to move at a speed of 5 mm/s against a flow velocity of 138 mm/s, which is comparable to a blood flow of 2700 mm³/s and several times faster than other reported driving strategies. This work offers a new strategy for constructing and controlling untethered magnetic robots in blood vessels, which would promote the application of millirobots in biomedical engineering.
|
| |
| 08:48-08:54, Paper MoAT4.4 | Add to My Program |
| Smooth Stride Length Change of Rat Robot with a Compliant Actuated Spine Based on CPG Controller |
|
| Huang, Yuhong | Technische Universität München |
| Bing, Zhenshan | Technical University of Munich |
| Zhang, Zitao | Sun Yat-Sen University |
| Huang, Kai | Sun Yat-Sen University |
| Morin, Fabrice O. | Technische Universität München |
| Knoll, Alois | Tech. Univ. Muenchen TUM |
Keywords: Robust/Adaptive Control, Motion Control, Biologically-Inspired Robots
Abstract: The aim of this research is to investigate the relationship between spinal flexion and quadruped locomotion in a rat robot equipped with a compliant spine, controlled by a central pattern generator (CPG). The study reveals that spinal flexion can enhance limb stride length, but it may also cause significant and unexpected motion disturbances during stride length variations. To address this issue, this paper proposes a CPG model driven by spinal flexion and a novel oscillator that incorporates a circular limit cycle and accounts for the anticipated stride length transition process. This approach effectively matches the torque change with the dynamics of stride length changes, leading to lower energy consumption. Extensive simulations are conducted to evaluate the efficacy of the proposed oscillator and compare it with the original kinetic model and other CPG models. The results demonstrate that the designed CPG model with the proposed oscillator yields smoother gait transitions during stride length variations and reduces energy consumption.
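The "oscillator that incorporates a circular limit cycle" mentioned in the abstract belongs to the Hopf-oscillator family widely used in CPG work. The sketch below is a generic Hopf oscillator, not the authors' model; the gain k, radius sqrt(mu), and frequency omega are arbitrary illustrative choices:

```python
import numpy as np

def hopf_step(x, y, dt, mu=1.0, omega=2 * np.pi, k=10.0):
    """One Euler step of a Hopf oscillator: trajectories converge to a
    circular limit cycle of radius sqrt(mu), traversed at frequency omega."""
    r2 = x * x + y * y
    dx = k * (mu - r2) * x - omega * y
    dy = k * (mu - r2) * y + omega * x
    return x + dt * dx, y + dt * dy

x, y = 0.1, 0.0                  # start well inside the limit cycle
for _ in range(20000):           # integrate for 2 s at dt = 0.1 ms
    x, y = hopf_step(x, y, 1e-4)
print(np.hypot(x, y))            # converges toward sqrt(mu) = 1.0
```

The attraction to the circle is what makes such oscillators robust to perturbations, which is the property the paper exploits for smooth stride-length transitions.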
|
| |
| 08:54-09:00, Paper MoAT4.5 | Add to My Program |
| Learning Terrain-Adaptive Locomotion with Agile Behaviors by Imitating Animals |
|
| Li, Tingguang | The Chinese University of Hong Kong |
| Zhang, Yizheng | Tencent |
| Zhang, Chong | Tencent |
| Zhu, Qingxu | Tencent |
| Sheng, Jiapeng | Shandong University |
| Chi, Wanchao | Tencent |
| Zhou, Cheng | Tencent |
| Han, Lei | Tencent Robotics X |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, AI-Based Methods
Abstract: In this paper, we present a general learning framework for controlling a quadruped robot that can mimic the behavior of real animals and traverse challenging terrains. Our method consists of two steps: an imitation learning step to learn from motions of real animals, and a terrain adaptation step to enable generalization to unseen terrains. We capture motions from a Labrador on various terrains to facilitate terrain-adaptive locomotion. Our experiments demonstrate that our policy can traverse various terrains and produce natural-looking behavior. We deployed our method on the real quadruped robot Max via zero-shot simulation-to-reality transfer, achieving a speed of 1.1 m/s while climbing stairs.
|
| |
| 09:00-09:06, Paper MoAT4.6 | Add to My Program |
| A Stable Adaptive Extended Kalman Filter for Estimating Robot Manipulators Link Velocity and Acceleration |
|
| Baradaran Birjandi, Seyed Ali | Technical University of Munich |
| Khurana, Harshit | EPFL |
| Billard, Aude | EPFL |
| Haddadin, Sami | Technical University of Munich |
Keywords: Sensor Fusion, Kinematics
Abstract: One can estimate the velocity and acceleration of robot manipulator links by utilizing nonlinear observers. This involves combining inertial measurement units (IMUs) with the motor encoders of the robot through a model-based sensor fusion technique. This approach is lightweight, versatile (suitable for a wide range of trajectories and applications), and straightforward to implement. To further improve the estimation accuracy at runtime, in this paper we propose to adapt the noise information online. This automatically reduces the system's vulnerability to imperfect modeling and sensor changes. Moreover, viable strategies to maintain system stability are introduced. Finally, we thoroughly evaluate the overall framework on a seven-DoF robot manipulator whose links are equipped with IMUs.
|
| |
| 09:06-09:12, Paper MoAT4.7 | Add to My Program |
| Provably Correct Sensor-Driven Path-Following for Unicycles Using Monotonic Score Functions |
|
| Clark, Benton | University of Kentucky |
| Hariprasad, Varun | Paul Laurence Dunbar High School |
| Poonawala, Hasan A. | University of Kentucky |
Keywords: Sensor-based Control, Autonomous Vehicle Navigation, Machine Learning for Robot Control
Abstract: This paper develops a provably stable sensor-driven controller for path-following applications of robots with unicycle kinematics, one specific class of which is the wheeled mobile robot (WMR). The sensor measurement is converted to a scalar value (the score) through some mapping (the score function); the latter may be designed or learned. The score is then mapped to forward and angular velocities using a simple rule with three parameters. The key contribution is that the correctness of this controller relies only on the score function satisfying monotonicity conditions with respect to the underlying state (local path coordinates), instead of achieving specific values at all states. The monotonicity conditions may be checked online by moving the WMR, without state estimation, or offline using a generative model of measurements such as in a simulator. Our approach provides both the practicality of purely measurement-based control and the correctness of state-based guarantees. We demonstrate the effectiveness of this path-following approach on both a simulated and a physical WMR that use a learned score function derived from a binary classifier trained on real depth images.
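The control rule the abstract summarizes, a score mapped to forward and angular velocities by a simple three-parameter rule, can be illustrated with a hypothetical instance. The exact rule and its guarantees are the paper's; the saturation form, gain values, and function name below are all assumptions made for illustration:

```python
def unicycle_command(score, v_max=0.5, k_turn=2.0, s_ref=0.0):
    """Map a scalar score to (forward, angular) velocity using three
    parameters: cruise speed v_max, turn gain k_turn, and the score
    value s_ref at which the robot is considered on the path."""
    err = score - s_ref
    v = v_max / (1.0 + abs(k_turn * err))   # slow down when far off the path
    w = -k_turn * err                        # steer against the score error
    return v, w

print(unicycle_command(0.0))   # on the path: full speed, no turning
print(unicycle_command(0.4))   # off to one side: slower, turning back
```

With a rule like this, correctness hinges only on the score growing monotonically as the robot drifts off the path, which is exactly the kind of condition the paper shows can be checked without state estimation.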
|
| |
| 09:12-09:18, Paper MoAT4.8 | Add to My Program |
| Contact Reduction with Bounded Stiffness for Robust Sim-To-Real Transfer of Robot Assembly |
|
| Nghia, Vuong | Nanyang Technological University |
| Pham, Quang-Cuong | NTU Singapore |
Keywords: Simulation and Animation, Reinforcement Learning, Machine Learning for Robot Control
Abstract: In sim-to-real Reinforcement Learning (RL), a policy is trained in a simulated environment and then deployed on the physical system. The main challenge of sim-to-real RL is to overcome the reality gap, the discrepancies between the real world and its simulated counterpart. Using generic geometric representations such as convex decompositions, triangular meshes, or signed distance fields can improve simulation fidelity and thus potentially narrow the reality gap. Common to these approaches is that many contact points are generated for geometrically complex objects, which slows down simulation and may cause numerical instability. Contact reduction methods address these issues by limiting the number of contact points, but the validity of these methods for sim-to-real RL has not been confirmed. In this paper, we present a contact reduction method with bounded stiffness to improve the simulation accuracy. Our experiments show that the proposed method critically enables training an RL policy for a tight-clearance double pin insertion task and successfully deploying the policy on a rigid, position-controlled physical robot.
|
| |
| 09:18-09:24, Paper MoAT4.9 | Add to My Program |
| Trajectory Tracking Via Multiscale Continuous Attractor Networks |
|
| Joseph, Therese | Queensland University of Technology |
| Fischer, Tobias | Queensland University of Technology |
| Milford, Michael J | Queensland University of Technology |
Keywords: Neurorobotics, Cognitive Modeling
Abstract: Animals and insects showcase remarkably robust and adept navigational abilities, up to literally circumnavigating the globe. Primary progress in robotics inspired by these natural systems has occurred in two areas: highly theoretical computational neuroscience models, and handcrafted systems like RatSLAM and NeuroSLAM. In this research, we present work bridging the gap between the two, in the form of Multiscale Continuous Attractor Networks (MCAN), which combine the multiscale parallel spatial neural networks of the previous theoretical models with the real-world robustness of the robot-targeted systems, to enable trajectory tracking over large velocity ranges. To overcome previous systems' reliance on hand-tuned parameters, we present a genetic algorithm-based approach for automated tuning of these networks, substantially improving their usability. To provide challenging navigational scale ranges, we open-source a flexible city-scale navigation simulator that adapts to any street network, enabling high-throughput experimentation. In extensive experiments using the city-scale navigation environment and KITTI, we show that the system is capable of stable dead reckoning over a wide range of velocities and environmental scales, where a single-scale approach fails.
|
| |
| 09:24-09:30, Paper MoAT4.10 | Add to My Program |
| Design and Control of a Ballbot Drivetrain with High Agility, Minimal Footprint, and High Payload |
|
| Xiao, Chenzhang | University of Illinois at Urbana-Champaign |
| Mansouri, Mahshid | University of Illinois at Urbana-Champaign |
| Lam, David | University of Michigan - Ann Arbor |
| Ramos, Joao | University of Illinois at Urbana-Champaign |
| Hsiao-Wecksler, Elizabeth T. | University of Illinois at Urbana-Champaign |
Keywords: Body Balancing, Wheeled Robots, Underactuated Robots
Abstract: This paper presents the design and control of a ballbot drivetrain that aims to achieve high agility, minimal footprint, and high payload capacity while maintaining dynamic stability. Two hardware platforms and analytical models were developed to test design and control methodologies. The full-scale ballbot prototype (MiaPURE) was constructed using off-the-shelf components and designed to have agility, footprint, and balance similar to that of a walking human. The planar inverted pendulum testbed (PIPTB) was developed as a reduced-order testbed for quick validation of system performance. We then proposed a simple yet robust cascaded LQR-PI controller to balance and maneuver the ballbot drivetrain with a heavy payload. This is crucial because the drivetrain is often subject to high stiction due to elastomeric components in the torque transmission system. This controller was first tested in the PIPTB to compare with traditional LQR and cascaded PI-PD controllers, and then implemented in the ballbot drivetrain. The MiaPURE drivetrain was able to carry a payload of 60 kg, achieve a maximum speed of 2.3 m/s, and come to a stop from a speed of 1.4 m/s in 2 seconds in a selected translation direction. Finally, we demonstrated the omnidirectional movement of the ballbot drivetrain in an indoor environment as a payload-carrying robot and a human-riding mobility device. Our experiments demonstrated the feasibility of using the ballbot drivetrain as a universal mobility platform with agile movements, minimal footprint, and high payload capacity using our proposed design and control methodologies.
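The cascaded LQR-PI controller itself is the paper's contribution and is not reproduced here; as background, the inner LQR gain for a planar inverted pendulum testbed like the PIPTB can be obtained by discrete Riccati iteration. This is a minimal sketch on a toy Euler-discretized pendulum, with all dynamics parameters and cost weights chosen arbitrarily for illustration:

```python
import numpy as np

def dlqr(A, B, Q, R, iters=500):
    """Discrete-time LQR gain via fixed-point iteration of the Riccati
    equation: P <- Q + A'P(A - BK) with K = (R + B'PB)^-1 B'PA."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Linearized planar inverted pendulum, Euler-discretized.
dt, g, l = 0.01, 9.81, 1.0
A = np.array([[1.0, dt], [g / l * dt, 1.0]])   # state: [theta, theta_dot]
B = np.array([[0.0], [dt]])
K = dlqr(A, B, np.eye(2), np.array([[0.1]]))

# Closed loop is stable iff all eigenvalues of (A - BK) lie in the unit disc.
print(np.abs(np.linalg.eigvals(A - B @ K)))
```

In the paper's setting, an outer PI loop wrapped around a gain like this compensates the stiction that a pure LQR cannot reject, which motivates the cascaded structure.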
|
| |
| 09:30-09:36, Paper MoAT4.11 | Add to My Program |
| A Bayesian Reinforcement Learning Method for Periodic Robotic Control under Significant Uncertainty |
|
| Jia, Yuanyuan | Ritsumeikan University |
| Uriguen Eljuri, Pedro Miguel | Ritsumeikan University |
| Taniguchi, Tadahiro | Ritsumeikan University |
Keywords: Dexterous Manipulation, Medical Robots and Systems, Reinforcement Learning
Abstract: This paper addresses the lack of research on periodic reinforcement learning for physical robot control by presenting a 3-phase periodic Bayesian reinforcement learning method for uncertain environments. Drawing on cognition theory, the proposed approach achieves effective convergence with fewer training episodes. The coach-based demonstration phase narrows the search space and establishes a foundation for a coarse-to-fine control strategy. The reconnaissance phase enhances adaptability by discovering a valuable global representation, and the operation phase produces accurate robotic control by applying the learned representation and periodically updating local information. Comparative analysis with state-of-the-art methods validates the efficacy of our approach on exemplar control tasks in simulation and a biomedical project involving a simulated cranial window task.
|
| |
| 09:36-09:42, Paper MoAT4.12 | Add to My Program |
| Residual Physics Learning and System Identification for Sim-To-Real Transfer of Policies on Buoyancy Assisted Legged Robots |
|
| Sontakke, Nitish Rajnish | Georgia Institute of Technology |
| Chae, Hosik | University of California at Los Angeles |
| Lee, Sangjoon | University of California, Los Angeles |
| Huang, Tianle | Georgia Institute of Technology |
| Hong, Dennis | UCLA |
| Ha, Sehoon | Georgia Institute of Technology |
Keywords: Model Learning for Control, Reinforcement Learning, Legged Robots
Abstract: The light and soft characteristics of Buoyancy Assisted Lightweight Legged Unit (BALLU) robots have a great potential to provide intrinsically safe interactions in environments involving humans, unlike many heavy and rigid robots. However, their unique and sensitive dynamics impose challenges to obtaining robust control policies in the real world. In this work, we demonstrate robust sim-to-real transfer of control policies on the BALLU robots via system identification and our novel residual physics learning method, Environment Mimic (EnvMimic). First, we model the nonlinear dynamics of the actuators by collecting hardware data and optimizing the simulation parameters. Rather than relying on standard supervised learning formulations, we utilize deep reinforcement learning to train an external force policy to match real-world trajectories, which enables us to model residual physics with greater fidelity. We analyze the improved simulation fidelity by comparing the simulation trajectories against the real-world ones. We finally demonstrate that the improved simulator allows us to learn better walking and turning policies that can be successfully deployed on the hardware of BALLU.
|
| |
| 09:42-09:48, Paper MoAT4.13 | Add to My Program |
| DiffClothAI: Differentiable Cloth Simulation with Intersection-Free Frictional Contact and Differentiable Two-Way Coupling with Articulated Rigid Bodies |
|
| Yu, Xinyuan | National University of Singapore |
| Zhao, Siheng | Nanjing University |
| Luo, Siyuan | Xi'an Jiaotong University |
| Yang, Gang | National University of Singapore |
| Shao, Lin | National University of Singapore |
Keywords: Simulation and Animation, Optimization and Optimal Control
Abstract: Differentiable simulations have recently proven useful for various robotic manipulation tasks, including cloth manipulation. In robotic cloth simulation, it is crucial to maintain intersection-free properties. We present DiffClothAI, a differentiable cloth simulation with intersection-free frictional contact and two-way coupling with articulated rigid bodies. DiffClothAI coherently integrates Projective Dynamics and Incremental Potential Contact, and proposes an effective method to derive gradients in the cloth simulation. It also establishes a differentiable coupling mechanism between articulated rigid bodies and cloth. We conduct a comprehensive evaluation of DiffClothAI's effectiveness and accuracy and perform a variety of experiments in downstream robotic manipulation tasks. Supplemental materials and videos are available on our project webpage.
|
| |
| 09:48-09:54, Paper MoAT4.14 | Add to My Program |
| Walk-Burrow-Tug: Legged Anchoring Analysis Using RFT-Based Granular Limit Surfaces |
|
| Huh, Tae Myung | UC Berkeley |
| Cao, Cyndia | University of California Berkeley |
| Aderibigbe, Jadesola | University of California, Berkeley |
| Moon, Deaho | Korea Institute of Science and Technology |
| Stuart, Hannah | UC Berkeley |
Keywords: Contact Modeling, Legged Robots, Mobile Manipulation
Abstract: We develop a new resistive force theory based granular limit surface (RFT-GLS) method to predict and guide behaviors of forceful ground robots. As a case study, we harness a small mobile robotic system, MiniRQuad (296 g), to "walk-burrow-tug": it actively exploits ground anchoring by burrowing its legs to tug loads. RFT-GLS informs the selection of efficient strategies to transport sleds with varying masses. The granular limit surface (GLS), a wrench boundary that separates stationary and kinetic behavior, is computed using 3D resistive force theory (RFT) for a given body and set of motion twists. This limit surface is then used to predict the quasi-static trajectory of the robot when it fails to withstand an external load. We find that RFT-GLS enables accurate force and motion predictions in laboratory tests. For control applications, a pre-composed state-space map of twist-wrench pairs enables computationally efficient simulations to improve robotic anchoring strategies.
|
| |
| MoAT5 Regular session, 140E |
Add to My Program |
| Mechanism Design I |
|
| |
| Chair: Tadakuma, Kenjiro | Tohoku University |
| Co-Chair: Sorokin, Maks | Georgia Institute of Technology |
| |
| 08:30-08:36, Paper MoAT5.1 | Add to My Program |
| Tube Mechanism with 3-Axis Rotary Joints Structure to Achieve Variable Stiffness Using Positive Pressure |
|
| Onda, Issei | Tohoku University |
| Watanabe, Masahiro | Tohoku University |
| Tadakuma, Kenjiro | Tohoku University |
| Abe, Kazuki | Tohoku University |
| Tadokoro, Satoshi | Tohoku University |
Keywords: Mechanism Design, Hydraulic/Pneumatic Actuators, Flexible Robotics
Abstract: Studies on soft robotics have explored mechanisms for switching the stiffness of a robot structure. The hybrid soft-rigid approach, which combines soft materials and high-rigidity structures, is commonly used to achieve variable stiffness mechanisms. In particular, the positive-pressurization method has attracted significant attention in recent years, as it can eliminate the constraints on driving pressure. Moreover, it can change the shape-holding force according to the internal pressure. In this study, a variable stiffness mechanism comprising 3-axis rotary ball joints and a single chamber was devised, using frictional force generated by positive pressure. The prototype can change joint angles arbitrarily when no pressure is applied and can hold joint angles when positive pressure is applied. Using a theoretical model of the torque required to hold the joint angle, we simulated the holding torque using finite element method (FEM) analysis and measured the holding torque in the pitch and roll directions when internal pressure was applied. Based on the agreement among the theoretical model, measurements, and FEM analysis, it was confirmed that the holding torque in the roll direction was approximately π/2 times larger than that in the pitch direction for each value of the internal pressure. Further, we evaluated the FEM, theoretical, and measured values of the holding torque by performing pairwise numerical comparisons. Our approach will aid the design of effective stiffening mechanisms for soft robotics applications.
|
| |
| 08:36-08:42, Paper MoAT5.2 | Add to My Program |
| Timor Python: A Toolbox for Industrial Modular Robotics |
|
| Kölz, Jonathan | Technical University of Munich |
| Mayer, Matthias | Technical University of Munich |
| Althoff, Matthias | Technische Universität München |
Keywords: Cellular and Modular Robots, Methods and Tools for Robot System Design, Software Tools for Robot Programming
Abstract: Modular Reconfigurable Robots (MRRs) represent an exciting path forward for industrial robotics, opening up new possibilities for robot design. Compared to monolithic manipulators, they promise greater flexibility, improved maintainability, and cost-efficiency. However, there is no tool or standardized way to model and simulate assemblies of modules in the same way it has been done for robotic manipulators for decades. We introduce the Toolbox for Industrial Modular Robotics (Timor), a Python toolbox to bridge this gap and integrate modular robotics into existing simulation and optimization pipelines. Our open-source library offers model generation and task-based configuration optimization for MRRs. It can easily be integrated with existing simulation tools, not least by offering URDF export of arbitrary modular robot assemblies. Moreover, our experimental study demonstrates the effectiveness of Timor as a tool for designing modular robots optimized for specific use cases.
|
| |
| 08:42-08:48, Paper MoAT5.3 | Add to My Program |
| Ultra-Low Inertia 6-DOF Manipulator Arm for Touching the World |
|
| Nishii, Kazutoshi | Toyota Motor Corporation |
| Okumatsu, Yoshihiro | Toyota Motor Corporation |
| Hatano, Akira | Toyota Motor Corporation |
Keywords: Mechanism Design, Tendon/Wire Mechanism
Abstract: As robotic intelligence increases, so does the importance of agents that collect data from real-world environments. When learning in contact with the environment, one must consider how to minimize the impact on the environment and maintain reproducibility. To achieve this, the contact force with the environment must be reduced; one way to do so is to reduce the inertia of the arm. In this study, we present a 6-degree-of-freedom, low-inertia arm we have developed. The inertia of our arm is significantly reduced compared to previous research, and experiments confirm that it also has low joint friction torque and good contact sensitivity.
|
| |
| 08:48-08:54, Paper MoAT5.4 | Add to My Program |
| Determination of the Characteristics of Gears of Robot-Like Systems by Analytical Description of Their Structure |
|
| Landler, Stefan | Technical University of Munich |
| Molina Blanco, Raúl | Technical University of Munich |
| Otto, Michael | Technical University of Munich, Chair of Machine Elements, Gear |
| Vogel-Heuser, Birgit | Technical University Munich |
| Zimmermann, Markus | Technical University of Munich |
| Stahl, Karsten | Technical University of Munich |
Keywords: Methods and Tools for Robot System Design, Product Design, Development and Prototyping, Engineering for Robotic Systems
Abstract: The axes of robots and robot-like systems (RLS) usually include electric-motor-gearbox arrangements for optimal connection of the elements. The characteristics of the drive system, and thus also of the robot, depend strongly on the gears. Different gearbox designs are available, which differ in stiffness, efficiency, and further properties. For an application-optimal design of RLS, uniform documentation and comparability of gearbox concepts are decisive factors. The application-optimal design is supported by an interdisciplinary approach between mechanical engineering and software design, guided by adequate product development methodology. The quite heterogeneous characterization of gearboxes for RLS, which is currently the state of the art, is a relevant obstacle to the flexible and optimal design of RLS. This paper presents an analysis of the gear structure with unified symbols for specific machine elements and contact types. The introduced method gives insight into the mechanical structure of the gearboxes. Similarities between gear types can thus be revealed. This also enables the classification of new developments in the state of the art. Moreover, the developed method for analyzing the gear structure can be used to determine the characteristics of gears, such as backlash, efficiency, or stiffness. Specifically, the stiffness of gears can be synthesized from the force action of individual contacts and the individual phenomena that occur with them. The representation by individual phenomena also makes it possible to extend the calculation to include influencing parameters, such as temperature, that have not been sufficiently taken into account so far.
|
| |
| 08:54-09:00, Paper MoAT5.5 | Add to My Program |
| Tension Jamming for Deployable Structures |
|
| Hasegawa, Daniel | Harvard University |
| Aktas, Buse | ETH Zurich |
| Howe, Robert D. | Harvard University |
Keywords: Mechanism Design, Compliant Joints and Mechanisms, Soft Robot Materials and Design
Abstract: Deployable structures provide adaptability and versatility for applications such as temporary architecture, space structures, and biomedical devices. Jamming is a mechanical phenomenon in which dramatic changes in stiffness are achieved by increasing the frictional and kinematic coupling between the constituents of a structure through an external pressure. This study applies jamming, which has primarily been used in medium-scale soft robotics applications, to large-scale deployable structures with components that are soft and compact during transport but rigid upon deployment. It proposes a new jamming structure with a novel built-in actuation mechanism that enables high performance at large scales: a composite beam made of rectangular segments along a cable, which can be pre-tensioned and thus jammed. Two theoretical models are developed to provide insight into the mechanical behavior of the composite beams and predict their performance under loading. A scale model of a deployable bridge is built using the tension-based composite beams, and the bridge is deployed and assembled by air with a drone, demonstrating the versatility and viability of the proposed approach for robotics applications.
|
| |
| 09:00-09:06, Paper MoAT5.6 | Add to My Program |
| Task2Morph: Differentiable Task-Inspired Framework for Contact-Aware Robot Design |
|
| Cai, Yishuai | National University of Defense Technology |
| Yang, Shaowu | National University of Defense Technology |
| Li, Minglong | National University of Defense Technology |
| Chen, Xinglin | National University of Defense Technology |
| Mao, Yunxin | National University of Defense Technology |
| Yi, Xiaodong | National University of Defense Technology |
| Yang, Wenjing | State Key Laboratory of High Performance Computing (HPCL), Schoo |
Keywords: Evolutionary Robotics, AI-Enabled Robotics
Abstract: Optimizing the morphologies and controllers that adapt to various tasks is a critical issue in the field of robot design, also known as embodied intelligence. Previous works typically model it as a joint optimization problem and use search-based methods to find the optimal solution in the morphology space. However, they ignore the implicit knowledge of the task-to-morphology mapping, which can directly inspire robot design. For example, flipping heavier boxes tends to require more muscular robot arms. This paper proposes Task2Morph, a novel and general differentiable task-inspired framework for contact-aware robot design. We abstract task features highly related to task performance and use them to build a task-to-morphology mapping. Further, we embed the mapping into a differentiable robot design process, where the gradient information is leveraged both for learning the mapping and for the whole optimization. The experiments are conducted on three scenarios, and the results validate that Task2Morph outperforms DiffHand, which lacks a task-inspired morphology module, in terms of efficiency and effectiveness.
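The core idea of differentiating through a task-to-morphology mapping can be sketched in a few lines. Everything below — the loss model, the task feature, the weight, and all numbers — is invented for illustration; Task2Morph learns the mapping jointly with a differentiable simulator, which is omitted here:

```python
# Toy sketch (hypothetical): a task feature f feeds a mapping m = w * f,
# and gradients of a differentiable task loss flow back to the weight w.
def task_loss(m, f):
    # Pretend performance model: the optimal "arm strength" m scales
    # with the task feature f (e.g., box mass), optimum at m = 0.5 * f.
    return (m - 0.5 * f) ** 2

def train_mapping(f=2.0, w=0.1, lr=0.1, iters=50):
    for _ in range(iters):
        m = w * f                     # morphology proposed by the mapping
        dL_dm = 2.0 * (m - 0.5 * f)   # gradient of task_loss w.r.t. m
        w -= lr * dL_dm * f           # chain rule back to the mapping weight
    return w * f                      # morphology proposed for this task

print(train_mapping())                # converges to the task optimum 1.0
```

Because the mapping is differentiable, the same gradient signal can drive both the mapping weights and the morphology itself, which is the structural point the abstract makes.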
|
| |
| 09:06-09:12, Paper MoAT5.7 | Add to My Program |
| Constraint Programming for Component-Level Robot Design |
|
| Wilhelm, Andrew | Cornell University |
| Napp, Nils | Cornell University |
Keywords: Methods and Tools for Robot System Design, Formal Methods in Robotics and Automation, Product Design, Development and Prototyping
Abstract: Effective design automation for building robots would make development faster, easier, and less prone to design errors. However, complex multi-domain constraints make creating such tools difficult. One persistent challenge in achieving this goal of design automation is the fundamental problem of component selection, an optimization problem where, given a general robot model, components must be selected from a possibly large set of catalogs to minimize design objectives while meeting target specifications. Different approaches to this problem have used Monotone Co-Design Problems (MCDPs) or linear and quadratic programming, but these require judicious system approximations that affect the accuracy of the solution. We take an alternative approach, formulating the component selection problem as a combinatorial optimization problem, which does not require any system approximations, and using constraint programming (CP) to solve it with a depth-first branch-and-bound algorithm. As the efficacy of CP critically depends upon the orderings of variables and their domain values, we present two heuristics specific to the problem of component selection that significantly improve solve time compared to traditional constraint satisfaction programming heuristics. We also add redundant constraints to the optimization problem to further improve run time by evaluating certain global constraints before all relevant variables are assigned. We demonstrate that our CP approach can find optimal solutions from over 20 trillion candidate solutions in only seconds, up to 48 times faster than an MCDP approach solving the same problem. Finally, for three different robot designs we build the corresponding robots to physically validate that the selected components meet the target design specifications.
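The component-selection formulation can be illustrated as a tiny combinatorial search. The catalogs, attributes, and target specs below are all invented; the sketch only mirrors the depth-first branch-and-bound idea with a value-ordering heuristic (lightest components first) that prunes on a partial-mass bound:

```python
# Hypothetical two-catalog example: pick one motor and one battery that
# meet power, capacity, and voltage specs while minimizing total mass.
MOTORS = [  # (name, mass_g, peak_power_W, voltage_V)
    ("M1", 40, 15, 6), ("M2", 65, 40, 12), ("M3", 90, 70, 12),
]
BATTERIES = [  # (name, mass_g, capacity_Wh, voltage_V)
    ("B1", 30, 5, 6), ("B2", 55, 12, 12), ("B3", 110, 30, 12),
]

def feasible(motor, battery, need_power=35, need_capacity=10):
    # Target specs: enough peak power, enough energy, matching bus voltage.
    return (motor[2] >= need_power and battery[2] >= need_capacity
            and motor[3] == battery[3])

def select():
    best, best_mass = None, float("inf")
    # Value-ordering heuristic: try light components first so good
    # incumbents appear early and prune more of the tree.
    for m in sorted(MOTORS, key=lambda c: c[1]):
        if m[1] >= best_mass:          # bound: partial mass already too big
            continue
        for b in sorted(BATTERIES, key=lambda c: c[1]):
            mass = m[1] + b[1]
            if mass >= best_mass:
                break                  # all remaining batteries are heavier
            if feasible(m, b):
                best, best_mass = (m[0], b[0]), mass
    return best, best_mass

print(select())  # -> (('M2', 'B2'), 120)
```

A real CP solver generalizes this to many catalogs and multi-domain constraints, but the prune-on-bound structure is the same.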
|
| |
| 09:12-09:18, Paper MoAT5.8 | Add to My Program |
| Design and Implementation of a Two-Limbed 3T1R Haptic Device |
|
| Kang, Long | Nanjing University of Science and Technology |
| Yang, Yang | Nanjing University of Information Science and Technology |
| Yi, Byung-Ju | Hanyang University |
Keywords: Mechanism Design, Haptics and Haptic Interfaces, Parallel Robots
Abstract: This paper presents a haptic device with a simple architecture of only two limbs that can provide translational motion in three degrees of freedom (DOF) and one-DOF rotational motion. Actuation redundancy eliminates all forward-kinematic singularities and improves the motion-force transmission property. Thanks to the special structure of the kinematic chains, all actuators are close to the base and full gravity compensation is achieved passively by using springs. Force producibility analysis shows that this haptic device is able to produce long-term continuous force feedback of 15-30 N in each direction. By developing a prototype of the haptic device and a virtual three-dimensional simulator, a preliminary performance evaluation of the haptic device was conducted. In addition, a torque distribution algorithm considering a relaxed form of actuator-torque saturation was experimentally evaluated, and a comparison with other algorithms reveals that this algorithm offers several advantages.
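For a redundantly actuated device, torque distribution means finding actuator torques that realize a desired task wrench while respecting limits. The sketch below uses a minimum-norm pseudoinverse solution followed by a hard clamp; the wrench map, torques, and limits are invented, and the paper's algorithm handles saturation inside the optimization rather than by clamping afterwards:

```python
import numpy as np

# Hypothetical setup: 2 task DoF driven by 3 actuators (redundant).
A = np.array([[1.0, 0.5, 0.2],
              [0.0, 1.0, 0.8]])       # actuator-to-wrench map
w = np.array([4.0, 3.0])              # desired task wrench

t = np.linalg.pinv(A) @ w             # minimum-norm torque distribution
t_sat = np.clip(t, -3.0, 3.0)         # hard actuator-torque limits
print(np.allclose(A @ t, w))          # exact wrench before clamping
```

When the clamp is active, the realized wrench deviates from the target, which is exactly why saturation-aware distribution algorithms like the one evaluated in this paper are needed.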
|
| |
| 09:18-09:24, Paper MoAT5.9 | Add to My Program |
| Combining Measurement Uncertainties with the Probabilistic Robustness for Safety Evaluation of Robot Systems |
|
| Baek, Woo-Jeong | Karlsruhe Institute of Technology (KIT) |
| Ledermann, Christoph | Karlsruhe Institute of Technology |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
| Kroeger, Torsten | Karlsruher Institut für Technologie (KIT) |
Keywords: Methods and Tools for Robot System Design, Robot Safety, Probability and Statistical Methods
Abstract: In this paper, we present a method to combine measurement uncertainties with probabilistic robustness into a single system uncertainty measure. Providing a metric indicating the potential occurrence of dangerous situations is essential for safety-critical robot applications. However, due to the difficulty of finding a quantifiable, unambiguous representation, such a metric has not been derived to date. In the case of sensory devices, measurement uncertainties are usually provided by manufacturer specifications. Apart from that, several contributions demonstrate that the accuracy of neural networks is verifiable via their robustness. However, state-of-the-art literature is mainly concerned with theoretical investigations, and scarce attention has been devoted to transferring robustness to real-world applications. To fill this gap, we show how probabilistic robustness can be made useful for evaluating quantitative safety limits. Our key idea is to exploit the analogy between measurement uncertainties and probabilistic robustness: while measurement uncertainties reflect possible shifts due to technical limitations, robustness refers to the tolerated amount of distortion in the input data for an unaltered output. Inspired by this analogy, we combine both measures to quantify the system uncertainty online. We validate our method in different settings under real-world conditions. Our findings exemplify that incorporating the novel uncertainty metric effectively reduces the rate of dangerous situations in Human-Robot Collaboration.
|
| |
| 09:24-09:30, Paper MoAT5.10 | Add to My Program |
| Computational Design of Closed-Chain Linkages: Respawn Algorithm for Generative Design |
|
| Ivolga, Dmitriy | ITMO University |
| Nasonov, Kirill | ITMO University |
| Borisov, Ivan | ITMO University |
| Kolyubin, Sergey | ITMO University |
Keywords: Mechanism Design, Legged Robots, Grippers and Other End-Effectors
Abstract: Designing robots is a multiphase process aimed at solving a multi-criteria optimization problem to find the best possible detailed design. Generative design (GD) aims to accelerate the design process compared to manual design, since GD allows exploring and exploiting the vast design space more efficiently. In the field of robotics, however, relevant research focuses mostly on the generation of fully-actuated open-chain kinematics, which is trivial from a mechanical engineering perspective. Within this paper, we address the problem of generative design of closed-chain linkage mechanisms. A GD algorithm has to be able to generate meaningful mechanisms that satisfy conditions of existence. We propose an optimization-driven algorithm for the generation of planar closed-chain linkages that follow a predefined trajectory. The algorithm creates an unlimited range of physically reproducible design alternatives that can be further tested in simulation. These tests could be done in order to find solutions that satisfy extra criteria, e.g., desired dynamic behavior or low energy consumption. The proposed algorithm is called "respawn" since it builds a new linkage after its ancestor has been tested in a virtual environment, in pursuit of the optimal solution. To show that the algorithm is general enough, we present a set of generated linkages that can be used for a wide class of robots.
|
| |
| 09:30-09:36, Paper MoAT5.11 | Add to My Program |
| On Designing a Learning Robot: Improving Morphology for Enhanced Task Performance and Learning |
|
| Sorokin, Maks | Georgia Institute of Technology |
| Fu, Chuyuan | X, the Moonshot Factory |
| Tan, Jie | Google |
| Liu, Karen | Stanford University |
| Bai, Yunfei | Google X |
| Lu, Wenlong | Everyday Robots, X the Moonshot Factory |
| Ha, Sehoon | Georgia Institute of Technology |
| Khansari, Mohi | Google X |
Keywords: Mechanism Design, Visual Learning, Evolutionary Robotics
Abstract: As robots become more prevalent, optimizing their design for better performance and efficiency is becoming increasingly important. However, current robot design practices overlook the impact of perception and design choices on a robot's learning capabilities. To address this gap, we propose a comprehensive methodology that accounts for the interplay between the robot's perception, hardware characteristics, and task requirements. Our approach optimizes the robot's morphology holistically, leading to improved learning and task execution proficiency. To achieve this, we introduce a Morphology-AGnostIc Controller (MAGIC), which helps with the rapid assessment of different robot designs. The MAGIC policy is efficiently trained through a novel PRIvileged Single-stage learning via latent alignMent (PRISM) framework, which also encourages behaviors that are typical of robot onboard observation. Our simulation-based results demonstrate that holistically optimized morphologies improve robot performance by 15-20% on various manipulation tasks and require 25x less data to match the performance of human-expert-designed morphologies. In summary, our work contributes to the growing trend of learning-based approaches in robotics and emphasizes the potential of designing robots that facilitate better learning.
|
| |
| 09:36-09:42, Paper MoAT5.12 | Add to My Program |
| Development of a Dynamic Quadruped with Tunable, Compliant Legs |
|
| Chen, Fuchen | Arizona State University |
| Tao, Weijia | Arizona State University |
| Aukes, Daniel | Arizona State University |
Keywords: Mechanism Design, Compliant Joints and Mechanisms, Legged Robots
Abstract: To facilitate the study of how passive leg stiffness influences locomotion dynamics and performance, we have developed an affordable and accessible 400 g quadruped robot driven by tunable compliant laminate legs, whose series and parallel stiffness can be easily adjusted; fabrication only takes 2.5 hours for all four legs. The robot can trot at 0.52 m/s, or 4.4 body lengths per second, with a cost of transport (COT) of 3.2. Through locomotion experiments in both the real world and simulation, we demonstrate that legs with different stiffness have a clear impact on the robot's average speed, COT, and pronking height. When the robot is trotting at 4 Hz in the real world, changing the leg stiffness yields a maximum improvement of 37.1% in speed and 62.0% in COT, showing the platform's great potential for future research on locomotion controller design and leg stiffness optimization.
|
| |
| 09:42-09:48, Paper MoAT5.13 | Add to My Program |
| A Passive Compliance Obstacle Crossing Robot for Power Line Inspection and Maintenance |
|
| Chen, Minghao | Institute of Automation, Chinese Academy of Sciences |
| Cao, Yinghua | Institute of Automation,Chinese Academy of Sciences |
| Tian, Yunong | Institute of Automation, Chinese Academy of Sciences |
| Li, En | Institute of Automation, Chinese Academy of Sciences |
| Liang, Zize | Institute of Automation, Chinese Academy of Sciences |
| Tan, Min | Institute of Automation, Chinese Academy of Sciences |
Keywords: Mechanism Design, Industrial Robots, Engineering for Robotic Systems
Abstract: In overhead power line systems, manual inspection and maintenance methods are inefficient and unsafe. Meanwhile, the majority of cantilevered robots have poor efficiency when crossing obstacles. This paper proposes a novel power line inspection and maintenance robot to solve these problems. The robot employs a passive compliance obstacle-crossing principle, which allows it to rapidly cross obstacles through the cooperation of gas springs and climbing wheels. Under high payload, the robot can roll over obstacles in 5-15 seconds without any complex strategies. A variable-configuration platform is also designed, with a multiple-line mode and a single-line mode, making the robot suitable for different kinds of overhead power lines. The related adaptability analyses are presented. Manipulators are also installed to help the robot perform specific maintenance tasks. The results of lab experiments and field tests reveal that the robot can stably and rapidly cross obstacles such as suspension clamps, vibration dampers, and spacers, and can perform three kinds of maintenance tasks on the line.
|
| |
| 09:48-09:54, Paper MoAT5.14 | Add to My Program |
| Open Robot Hardware: Progress, Benefits, Challenges, and Best Practices (I) |
|
| Patel, Vatsal | Yale University |
| Liarokapis, Minas | The University of Auckland |
| Dollar, Aaron | Yale University |
Keywords: Methods and Tools for Robot System Design, Product Design, Development and Prototyping, Mechanism Design
Abstract: Open-source projects have seen widespread adoption and improved availability in robotics over recent years. The rapid pace of progress in robotics is in part fueled by open-source projects, allowing researchers to implement novel ideas and approaches quickly. Open-source hardware in particular lowers the barrier of entry to new technologies, and can further accelerate innovation in robotics. But it is also more difficult to propagate in comparison to software because it requires replicating physical components. We present a review on Open Robot Hardware (ORH), by first highlighting key benefits and challenges encountered by users and developers of ORH, and relaying some best practices that can be adopted in developing an ORH. Then, we survey over 60 major ORH works in the different domains within robotics. Lastly, we identify strategies exemplified by the surveyed works to further detail the development process and guide developers through the design, documentation, and dissemination stages of an ORH project.
|
| |
| MoAT6 Regular session, 140FG |
Add to My Program |
| Modeling, Control, and Learning for Soft Robots I |
|
| |
| Chair: Gillespie, Brent | University of Michigan |
| Co-Chair: Karydis, Konstantinos | University of California, Riverside |
| |
| 08:30-08:36, Paper MoAT6.1 | Add to My Program |
| Modelling of Tendon Driven Robot Based on Constraint Analysis and Pseudo-Rigid Body Model |
|
| Troeung, Charles | Monash University |
| Liu, Shaotong | Monash University |
| Chen, Chao | Monash University |
Keywords: Modeling, Control, and Learning for Soft Robots, Tendon/Wire Mechanism, Soft Robot Applications
Abstract: Quasi-static models of tendon-driven continuum robots (TDCR) require consideration of both the kinematic and static conditions simultaneously. While the Pseudo-Rigid Body (PRB-3R) model has been demonstrated to be efficient, existing works ignore the mechanical effects of the tendons, such as elongation. In addition, the static equilibrium equations for the partially constrained tendons have been expressed in different forms within the literature. This leads to inconsistent simulation results, which have not been validated against experimental data when external loads are applied. Furthermore, the inverse problem of solving for the inputs required for a prescribed end-effector pose has not been studied for the PRB-3R model. In this work, we introduce a new modelling approach based on constraint analysis (CA) of a multi-body system and Lagrange multipliers to systematically derive all the relevant governing equations for a planar TDCR. This method can include tendon mechanics and efficiently solve both the direct and inverse kinetostatic models, with either forces or displacements as the actuation inputs. We validate the proposed CA method using numerical simulation of a benchmark model and experimental data.
|
| |
| 08:36-08:42, Paper MoAT6.2 | Add to My Program |
| An Improved Koopman-MPC Framework for Data-Driven Modeling and Control of Soft Actuators |
|
| Wang, Jiajin | Southeast University |
| Xu, Baoguo | Southeast University |
| Lai, Jianwei | Southeast University |
| Wang, Yifei | Southeast University |
| Hu, Cong | Guilin University of Electronic Technology |
| Li, Huijun | Southeast University |
| Song, Aiguo | Southeast University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators
Abstract: The challenge of achieving precise control of soft actuators with strong nonlinearity is mainly due to the difficulty of deriving models suitable for model-based control techniques. Fortunately, the Koopman operator provides a data-driven method for constructing control-oriented models of nonlinear systems to achieve model predictive control (MPC). This is called the Koopman-MPC framework, which is theoretically effective for soft actuators. Nevertheless, in this framework, a critical challenge is to select correct basis functions for Koopman-based modeling. Furthermore, there is room for improvement in control performance. To overcome these problems, this letter presents an improved Koopman-MPC framework to efficiently implement model-based control techniques for soft actuators. Firstly, we propose a systematic method for selecting the basis functions, which extends the measurement coordinates with derivative and time-delay coordinates and uses the sparse identification of nonlinear dynamics (SINDy) algorithm. Secondly, an incremental model predictive control with dynamic constraints (IMPCDC) is developed based on the Koopman model. Finally, several comparative experiments are conducted to verify the utility of the improved Koopman-MPC framework for data-driven modeling and control of soft actuators.
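The lifting-plus-least-squares core of Koopman-based modeling can be sketched compactly. The toy dynamics and the three-observable basis (measurement, one time-delay coordinate, one nonlinear term) below are invented for illustration; the paper's SINDy-based basis selection and the MPC layer are omitted:

```python
import numpy as np

def step(x):                          # "true" nonlinear dynamics (invented)
    return 0.9 * x - 0.1 * x**3

def lift(x, x_prev):
    # Basis: measurement, one time-delay coordinate, one nonlinear observable.
    return np.array([x, x_prev, x**3])

# Collect snapshot pairs (z_k, z_{k+1}) from short simulated rollouts.
rng = np.random.default_rng(0)
Z0, Z1 = [], []
for _ in range(100):
    x_prev = rng.uniform(-1.0, 1.0)
    x = step(x_prev)
    for _ in range(10):
        x_next = step(x)
        Z0.append(lift(x, x_prev))
        Z1.append(lift(x_next, x))
        x_prev, x = x, x_next
Z0, Z1 = np.array(Z0), np.array(Z1)

# EDMD-style fit: best linear operator K with z_{k+1} ~= K @ z_k.
K = np.linalg.lstsq(Z0, Z1, rcond=None)[0].T

# One-step prediction: advance the lifted state, read back coordinate 0.
x_pred = (K @ lift(0.5, 0.55))[0]
print(abs(x_pred - step(0.5)))        # tiny residual for this toy system
```

Because the toy dynamics are exactly linear in the chosen observables, the fitted operator predicts the next state almost perfectly; for a real soft actuator, the quality of the basis determines how close the lifted model gets, which is why basis selection matters.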
|
| |
| 08:42-08:48, Paper MoAT6.3 | Add to My Program |
| Soft Robot Shape Estimation: A Load-Agnostic Geometric Method |
|
| Sorensen, Christian | Brigham Young University |
| Killpack, Marc | Brigham Young University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Soft Robot Applications
Abstract: In this paper we present a novel kinematic representation of a soft continuum robot to enable full shape estimation using a purely geometric solution. The kinematic representation involves using length-varying piecewise constant curvature segments to describe the deformed shape of the robot. Based on this kinematic representation, we can use overlapping length sensors to estimate the shape of continuously deformable bodies without prior knowledge of the current loading conditions. We show an implementation that assumes one change in curvature along the length of a joint, using string potentiometers as an arc length sensor, and an orientation measurement from the tip of the continuum joint. For 56 randomized joint configurations, we estimate the shape of a 250 mm long continuously deformable robot with less than 2.5 mm of average error. The average error is reported for each of 10 equally spaced points along the length, demonstrating the ability to accurately represent the full shape of the soft robot.
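The constant-curvature building block of such representations has a closed-form tip pose. The single-segment sketch below is standard piecewise-constant-curvature geometry and is only illustrative; the paper's contribution is stitching length-varying segments together from overlapping length sensors, which is not shown here:

```python
import math

# Single constant-curvature segment of arc length L bent through total
# angle theta (theta = kappa * L); theta = 0 means a straight segment.
def pcc_tip(L, theta):
    if abs(theta) < 1e-9:
        return 0.0, L                 # straight: tip lies on the axis
    r = L / theta                     # bend radius = 1 / curvature
    return r * (1.0 - math.cos(theta)), r * math.sin(theta)

# Example: a 250 mm segment bent through 90 degrees (a quarter circle);
# both tip coordinates then equal the bend radius 2 * 250 / pi.
x, y = pcc_tip(250.0, math.pi / 2)
print(round(x, 1), round(y, 1))
```

Given arc-length readings and a tip orientation, solving these relations per segment yields the full backbone shape without any force model, which is what makes the approach load-agnostic.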
|
| |
| 08:48-08:54, Paper MoAT6.4 | Add to My Program |
| Robust Generalized Proportional Integral Control for Trajectory Tracking of Soft Actuators in a Pediatric Wearable Assistive Device |
|
| Mucchiani, Caio | University of California Riverside |
| Liu, Zhichao | University of California, Riverside |
| Sahin, Ipsita | University of California, Riverside |
| Kokkoni, Elena | University of California, Riverside |
| Karydis, Konstantinos | University of California, Riverside |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Wearable Robotics
Abstract: Soft robotics holds promise for the development of safe yet powered assistive wearable devices for infants. Key to this is the development of closed-loop controllers that can regulate pneumatic pressure in the device's actuators to induce controlled motion at the user's limbs and track different types of trajectories. This work develops a controller for soft pneumatic actuators aimed at powering a pediatric soft wearable robotic device prototype for upper-extremity motion assistance. The controller tracks desired trajectories for a system of soft pneumatic actuators supporting two-degree-of-freedom shoulder joint motion on an infant-sized engineered mannequin. The degrees of freedom assisted by the actuators are equivalent to shoulder motion (abduction/adduction and flexion/extension). Embedded inertial measurement unit sensors provide real-time joint feedback. Experimental data from reaching tasks performed with the engineered mannequin are obtained and compared against ground truth to evaluate the performance of the developed controller. Results reveal that the proposed controller achieves accurate trajectory tracking across a variety of shoulder joint motions.
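The closed-loop pressure-regulation idea can be sketched with a plain discrete PI loop. The plant model, gains, and time constants below are all invented, and the paper's robust generalized PI design is considerably more elaborate than this sketch:

```python
# Minimal discrete PI tracking loop on a hypothetical first-order
# pressure model (illustrative only, not the paper's controller).
def simulate(kp=2.0, ki=1.5, dt=0.01, steps=1000, target=1.0):
    p, integ = 0.0, 0.0               # pressure state, integral of error
    for _ in range(steps):
        e = target - p
        integ += e * dt
        u = kp * e + ki * integ       # PI command (e.g., valve input)
        p += dt * (-p + u)            # first-order actuator response
    return p

print(simulate())                     # settles near the 1.0 target
```

The integral term drives the steady-state tracking error to zero, which is the property any pressure-tracking controller for these actuators must provide; the robust generalized form additionally rejects structured disturbances.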
|
| |
| 08:54-09:00, Paper MoAT6.5 | Add to My Program |
| Data-Efficient Online Learning of Ball Placement in Robot Table Tennis |
|
| Tobuschat, Philip | Max Planck Institute for Intelligent Systems, Tübingen |
| Ma, Hao | Max Planck Institute for Intelligent Systems |
| Büchler, Dieter | Max Planck Institute for Intelligent Systems Tübingen |
| Schölkopf, Bernhard | Max Planck Institute for Intelligent Systems |
| Muehlebach, Michael | ETH |
Keywords: Modeling, Control, and Learning for Soft Robots, Bioinspired Robot Learning, Machine Learning for Robot Control
Abstract: We present an implementation of an online optimization algorithm for hitting a predefined target when returning ping-pong balls with a table tennis robot. The online algorithm optimizes over so-called interception policies, which define the manner in which the robot arm intercepts the ball. In our case, these are composed of the state of the robot arm (position and velocity) at interception time. Gradient information is provided to the optimization algorithm via the mapping from the interception policy to the landing point of the ball on the table, which is approximated with a black-box and a grey-box approach. Our algorithm is applied to a robotic arm with four degrees of freedom that is driven by pneumatic artificial muscles. As a result, the robot arm is able to return the ball onto any predefined target on the table after about 2-5 iterations. We highlight the robustness of our approach by showing rapid convergence with both the black-box and the grey-box gradients. In addition, the small number of iterations required to reach close proximity to the target also underlines the sample efficiency. A demonstration video can be found here: https://youtu.be/VC3KJoCss0k.
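The online update over interception policies reduces to gradient descent on the landing error. The 1-D landing model below is entirely invented; the real system maps a four-DoF interception state to a 2-D landing point and obtains gradients from black-box or grey-box models, but the update rule has this shape:

```python
# Black-box gradient sketch (hypothetical landing model).
def landing(policy):                  # unknown robot + ball dynamics
    return 0.6 * policy + 0.3         # pretend landing coordinate [m]

def online_update(target=1.2, policy=0.0, lr=2.0, iters=5, h=1e-3):
    for _ in range(iters):
        err = landing(policy) - target
        grad = (landing(policy + h) - landing(policy - h)) / (2 * h)
        policy -= lr * err * grad     # descent on 0.5 * err**2
    return policy

p = online_update()
print(abs(landing(p) - 1.2))          # small error after ~5 iterations
```

The few-iteration convergence in this sketch mirrors the 2-5 iterations reported in the abstract, though on the real robot each "iteration" is a physical ball return.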
|
| |
| 09:00-09:06, Paper MoAT6.6 | Add to My Program |
| Learning Reduced-Order Soft Robot Controller |
|
| Liang, Chen | Zhejiang University |
| Gao, Xifeng | Tencent America |
| Wu, Kui | Tencent |
| Pan, Zherong | Tencent America |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Optimization and Optimal Control
Abstract: Deformable robots are notoriously difficult to model or control due to their high-dimensional configuration spaces. Direct trajectory optimization suffers from the curse of dimensionality and incurs a high computational cost, while learning-based controller optimization methods are sensitive to hyper-parameter tuning. To overcome these limitations, we hypothesize that high-fidelity soft robots can be both simulated and controlled by restricting to low-dimensional spaces. Under this assumption, we propose a two-stage algorithm to identify such simulation- and control-spaces. Our method first identifies the so-called simulation-space that captures the salient deformation modes, to which the robot's governing equation is restricted. We then identify the control-space, to which control signals are restricted. We propose a multi-fidelity Riemannian Bayesian bilevel optimization to identify task-specific control spaces. We show that the dimension of the control-space can be less than 10 for a high-DOF soft robot to accomplish walking and swimming tasks, allowing low-dimensional MPC controllers to be applied to soft robots with tractable computational complexity.
|
| |
| 09:06-09:12, Paper MoAT6.7 | Add to My Program |
| A Single-Parameter Model for Soft Bellows Actuators under Axial Deformation and Loading |
|
| Treadway, Emma | Trinity University |
| Brei, Melissa | University of Michigan |
| Sedal, Audrey | McGill University |
| Gillespie, Brent | University of Michigan |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Hydraulic/Pneumatic Actuators
Abstract: Soft fluidic actuators are becoming popular for their backdrivability, potential for high power density, and their support for power supply through flexible tubes. Control and design of such actuators requires serviceable models that describe how they relate fluid pressure and flow to mechanical force and motion. We present a simple 2-port model of a bellows actuator that accounts for the relationships among fluid and mechanical variables imposed by the kinematics of the deforming bellows structure and accounts for elastic energy stored in the actuator's thermoplastic material structure. Elastic energy storage due to axial deformation is captured by revolving a differential strip whose linear elastic behavior is a nonlinear function of the actuator length. The model is evaluated through experiments in which either actuator length and pressure or force and pressure are imposed. The model has an error of 9.8% of the force range explored and yields insight into the effects of geometry changes. The resulting model can be used for model-based control or actuator design across the full operating range and can be exercised under either imposed force or imposed actuator length.
|
| |
| 09:12-09:18, Paper MoAT6.8 | Add to My Program |
| Task and Configuration Space Compliance of Continuum Robots Via Lie Group and Modal Shape Formulations |
|
| Orekhov, Andrew | Carnegie Mellon University |
| Johnston, Garrison | Vanderbilt University |
| Simaan, Nabil | Vanderbilt University |
Keywords: Modeling, Control, and Learning for Soft Robots, Kinematics, Flexible Robotics
Abstract: Continuum robots suffer large deflections due to internal and external forces. Accurate modeling of their passive compliance is necessary for accurate environmental interaction, especially in scenarios where direct force sensing is not practical. This paper focuses on deriving analytic formulations for the compliance of continuum robots that can be modeled as Kirchhoff rods. Compared to prior works, the approach presented herein is not subject to the constant-curvature assumptions to derive the configuration space compliance, and we do not rely on computationally expensive finite difference approximations to obtain the task space compliance. Using modal approximations over curvature space and Lie group integration, we obtain closed-form expressions for the task and configuration space compliance matrices of continuum robots, thereby bridging the gap between constant-curvature analytic formulations of configuration space compliance and variable-curvature task space compliance. We first present an analytic expression for the compliance of a single Kirchhoff rod. We then extend this formulation for computing both the task space and configuration space compliance of a tendon-actuated continuum robot. We then use our formulation to study the tradeoffs between computation cost and modeling accuracy as well as the loss in accuracy from neglecting the Jacobian derivative term in the compliance model. Finally, we experimentally validate the model on a tendon-actuated continuum segment, demonstrating the model's ability to predict passive deflections with error below 11.5% of total arc length.
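The bridge between configuration-space and task-space compliance is a congruence with the Jacobian, which a rigid-link analogue shows compactly. The planar 2-link arm, its joint compliances, and the pose below are invented; the paper derives the continuum-robot version with modal shapes and Lie-group integration (including the Jacobian-derivative term this sketch neglects):

```python
import numpy as np

# Rigid-link analogue of the compliance map C_task = J @ C_q @ J.T.
def jacobian(q, l=(1.0, 1.0)):
    q1, q2 = q
    return np.array([
        [-l[0]*np.sin(q1) - l[1]*np.sin(q1 + q2), -l[1]*np.sin(q1 + q2)],
        [ l[0]*np.cos(q1) + l[1]*np.cos(q1 + q2),  l[1]*np.cos(q1 + q2)],
    ])

C_q = np.diag([1 / 50.0, 1 / 30.0])   # joint compliances, rad/(N*m)
J = jacobian((0.3, 0.5))
C_task = J @ C_q @ J.T                # tip compliance, m/N
print(np.allclose(C_task, C_task.T))  # symmetric, as compliance must be
```

The congruence preserves symmetry and positive definiteness, so the mapped task-space compliance is itself a valid compliance matrix; the continuum case replaces joint coordinates with modal curvature coordinates.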
|
| |
| 09:18-09:24, Paper MoAT6.9 | Add to My Program |
| A Localization Framework for Boundary Constrained Soft Robots |
|
| Tanaka, Koki | Illinois Institute of Technology |
| Zhou, Qiyuan | Illinois Institute of Technology |
| Srivastava, Ankit | Illinois Institute of Technology |
| Spenko, Matthew | Illinois Institute of Technology |
Keywords: Modeling, Control, and Learning for Soft Robots, Localization, Soft Robot Applications
Abstract: Soft robots possess unique capabilities for adapting to the environment and interacting with it safely. However, their deformable nature also poses challenges for controlling their movement. In particular, the large deformations of a soft robot make it difficult to localize its individual body parts, which in turn impedes effective control. This paper introduces a novel localization framework designed for soft robots that are constrained by boundaries and benefit from a unique hardware architecture. To this end, we propose a method that exploits the flexible boundaries of the robot to create an onboard sensor capable of measuring the relative distances between its sub-robots. This measurement data is incorporated into a linear Kalman filter for accurate localization. We evaluate the framework's performance in benchmark and dynamic cases and demonstrate its effectiveness in improving localization accuracy compared to an IMU-based approach. The results also show that the proposed method achieves sufficient localization accuracy for contact-based mapping, enabling the robot to sense the location of obstacles in the environment. Finally, we validate the proposed framework using a physical prototype of a boundary-constrained soft robot and demonstrate its ability to accurately estimate the robot's shape. This framework has the potential to enable soft robots to autonomously navigate and map unknown environments, which could be beneficial for a variety of exploration tasks.
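Fusing relative-distance measurements in a linear Kalman filter can be illustrated with a scalar case. The 1-D setup, noise levels, and anchor below are all invented; the paper's filter runs over the full multi-sub-robot state, but the predict/update cycle is the same:

```python
import random

# Scalar Kalman-filter sketch: estimate one sub-robot's 1-D position from
# noisy relative-distance measurements to an anchor at a known position.
def kalman_track(true_pos=2.0, anchor=0.0, meas_std=0.2, steps=50, seed=1):
    rng = random.Random(seed)
    x, P = 0.0, 10.0                  # initial estimate and its variance
    Q, R = 1e-4, meas_std ** 2        # process / measurement noise variances
    for _ in range(steps):
        P += Q                        # predict (robot assumed stationary)
        z = (true_pos - anchor) + rng.gauss(0.0, meas_std)  # boundary sensor
        K = P / (P + R)               # Kalman gain
        x += K * (anchor + z - x)     # update with the implied position
        P *= (1.0 - K)
    return x

print(kalman_track())                 # close to the true position 2.0
```

Each distance reading shrinks the posterior variance, which is why the filtered estimate ends up far more accurate than any single noisy measurement, and than dead-reckoned IMU integration.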
|
| |
| 09:24-09:30, Paper MoAT6.10 | Add to My Program |
| EViper: A Scalable Platform for Untethered Modular Soft Robots |
|
| Cheng, Hsin | Princeton University |
| Zheng, Zhiwu | Princeton University |
| Kumar, Prakhar | Princeton University |
| Afridi, Wali | Ithaca Senior High School |
| Kim, Ben | Princeton University |
| Wagner, Sigurd | Princeton University |
| Verma, Naveen | Princeton University |
| Sturm, James | Princeton University |
| Chen, Minjie | Princeton University |
Keywords: Modeling, Control, and Learning for Soft Robots
Abstract: Soft robots present unique capabilities, but have been limited by the lack of scalable technologies for construction and the complexity of algorithms for efficient control and motion. These depend on soft-body dynamics, high-dimensional actuation patterns, and external/onboard forces. This paper presents scalable methods and platforms to study the impact of weight distribution and actuation patterns on fully untethered modular soft robots. An extendable Vibrating Intelligent Piezo-Electric Robot (eViper), together with an open-source Simulation Framework for Electroactive Robotic Sheet (SFERS) implemented in PyBullet, was developed as a platform to analyze the complex weight-locomotion interaction. By integrating power electronics, sensors, actuators, and batteries onboard, the eViper platform enables rapid design iteration and evaluation of different weight distribution and control strategies for the actuator arrays. The design supports both physics-based modeling and data-driven modeling via onboard automatic data-acquisition capabilities. We show that SFERS can provide useful guidelines for optimizing the weight distribution and actuation patterns of the eViper, thereby achieving maximum speed or minimum cost of transport (COT).
|
| |
| 09:30-09:36, Paper MoAT6.11 | Add to My Program |
| Domain Randomization for Robust, Affordable and Effective Closed-Loop Control of Soft Robots |
|
| Tiboni, Gabriele | Politecnico Di Torino |
| Protopapa, Andrea | Politecnico Di Torino |
| Tommasi, Tatiana | Politecnico Di Torino |
| Averta, Giuseppe | Politecnico Di Torino |
Keywords: Modeling, Control, and Learning for Soft Robots, Reinforcement Learning
Abstract: Soft robots are gaining popularity thanks to their intrinsic safety during contact and their adaptability. However, the potentially infinite number of Degrees of Freedom makes their modeling a daunting task, and in many cases only an approximated description is available. This challenge makes reinforcement learning (RL) based approaches inefficient when deployed in realistic scenarios, due to the large domain gap between models and the real platform. In this work, we demonstrate, for the first time, how Domain Randomization (DR) can solve this problem by enhancing RL policies for soft robots with: i) robustness w.r.t. unknown dynamics parameters; ii) reduced training times by exploiting drastically simpler dynamic models for learning; iii) better environment exploration, which can lead to exploitation of environmental constraints for optimal performance. Moreover, we introduce a novel algorithmic extension of previous adaptive domain randomization methods for the automatic inference of dynamics parameters for deformable objects. We provide an extensive evaluation in simulation on four different tasks and two soft robot designs, opening interesting perspectives for future research on Reinforcement Learning for closed-loop soft robot control.
|
| |
| 09:36-09:42, Paper MoAT6.12 | Add to My Program |
| Implementation of a Cosserat Rod-Based Configuration Tracking Controller on a Multi-Segment Soft Robotic Arm |
|
| Doroudchi, Azadeh | Arizona State University |
| Qiao, Zhi | ASU |
| Zhang, Wenlong | Arizona State University |
| Berman, Spring | Arizona State University |
Keywords: Modeling, Control, and Learning for Soft Robots, Motion Control, Distributed Robot Systems
Abstract: Controlling soft continuum robotic arms is challenging due to their hyper-redundancy and dexterity. In this paper we experimentally demonstrate, for the first time, closed-loop control of the configuration space variables of a soft robotic arm, composed of independently controllable segments, using a Cosserat rod model of the robot and the distributed sensing and actuation capabilities of the segments. Our controller solves the inverse dynamic problem by simulating the Cosserat rod model in MATLAB using a computationally efficient numerical solution scheme, and it applies the computed control output to the actual robot in real time. The position and orientation of the tip of each segment are measured in real time, while the remaining unknown variables that are needed to solve the inverse dynamics are estimated simultaneously in the simulation. We implement the controller on a multi-segment silicone robotic arm with pneumatic actuation, using a motion capture system to measure the segments' positions and orientations. The controller is used to reshape the arm into configurations that are achieved through combinations of bending and extension deformations in 3D space. Although the possible deformations are limited for this robot platform, our study demonstrates the potential for implementing the control approach on a wide range of continuum robots in practice. The resulting tracking performance indicates the effectiveness of the controller and the accuracy of the simulated Cosserat rod model.
|
| |
| 09:42-09:48, Paper MoAT6.13 | Add to My Program |
| Closed Loop Static Control of Multi-Magnet Soft Continuum Robots |
|
| Pittiglio, Giovanni | Harvard University |
| Orekhov, Andrew | Carnegie Mellon University |
| da Veiga, Tomas | University of Leeds |
| Calò, Simone | University of Leeds |
| Chandler, James Henry | University of Leeds |
| Simaan, Nabil | Vanderbilt University |
| Valdastri, Pietro | University of Leeds |
Keywords: Force Control, Medical Robots and Systems, Formal Methods in Robotics and Automation
Abstract: This paper discusses a novel static control approach applied to magnetic soft continuum robots (MSCRs). Our aim is to demonstrate the control of a multi-magnet soft continuum robot (SCR) in 3D. The proposed controller, based on a simplified yet accurate model of the robot, has a high update rate and is capable of real-time shape control. For the actuation of the MSCR, we employ the dual external permanent magnet (dEPM) platform, and we sense the shape via fiber Bragg grating (FBG). The employed actuation system and sensing technique make the proposed approach directly applicable to the medical context. We demonstrate that the proposed controller, running at approximately 300 Hz, is capable of shape tracking with a mean error of 8.5% and a maximum error of 35.2%. We experimentally show that the static controller is 25.9% more accurate than a standard PID controller in shape tracking.
|
| |
| MoAT7 Regular session, 258/259 |
Add to My Program |
| Cooperating Robots |
|
| |
| Chair: Krakow, Lucas | Texas A&M University |
| Co-Chair: Dantam, Neil | Colorado School of Mines |
| |
| 08:30-08:36, Paper MoAT7.1 | Add to My Program |
| IF-Based Trajectory Planning and Cooperative Control for Transportation System of Cable Suspended Payload with Multi UAVs |
|
| Zhang, Yu | Northeastern University, China |
| Xu, Jie | Northeastern University, China |
| Zhao, Cheng | Northeastern University, China |
| Dong, Jiuxiang | Northeastern University, China |
Keywords: Distributed Robot Systems, Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents
Abstract: In this paper, we tackle the control and trajectory planning problems for the cooperative transportation system of a cable-suspended payload with multiple Unmanned Aerial Vehicles (UAVs). Firstly, a payload controller is presented that considers the dynamic coupling between the UAV and the payload to accomplish active suppression of payload swing and tracking of complex payload trajectories. Secondly, different from the simplification of obstacles in most approaches, we propose three Insetting Formation (IF) algorithms that handle the complete obstacle shape to generate collision-free waypoints for the cooperative transportation system. An IF strategy is proposed by integrating the three IF algorithms to improve the success rate of obstacle avoidance and reduce the algorithm complexity for performing aggressive flight. Finally, we verify the robustness and high performance of the proposed algorithm through benchmark comparisons and real-world experiments. Moreover, our source code is released as an open-source ROS package.
|
| |
| 08:36-08:42, Paper MoAT7.2 | Add to My Program |
| Cooperative Dual-Arm Control for Heavy Object Manipulation Based on Hierarchical Quadratic Programming |
|
| Dio, Maximilian | Friedrich-Alexander-Universität Erlangen-Nürnberg |
| Völz, Andreas | Friedrich-Alexander-Universität Erlangen-Nürnberg |
| Graichen, Knut | Friedrich-Alexander-Universität Erlangen-Nürnberg |
Keywords: Cooperating Robots, Dual Arm Manipulation, Optimization and Optimal Control
Abstract: This paper presents a new control scheme for cooperative dual-arm robots manipulating heavy objects. The proposed method uses the full dynamical model of the kinematically coupled robot system and builds on a hierarchical quadratic programming (HQP) formulation to enforce dynamical inequality constraints such as joint torques or internal loads. This ensures optimal tracking of an object trajectory, while additional objectives with lower priority are optimized on the prior solution space. Therefore, the redundancy of the inherent load distribution problem between the two arms can be eliminated. With this approach, higher object loads can be manipulated compared to non-optimized methods. Simulations with a 14-DoF dual-arm robotic system demonstrate the effectiveness of the proposed control method. The real-time feasibility is guaranteed with an average computation time of less than 0.35 milliseconds at a control rate of 1 kilohertz.
|
| |
| 08:42-08:48, Paper MoAT7.3 | Add to My Program |
| Multi-UAV Adaptive Path Planning Using Deep Reinforcement Learning |
|
| Westheider, Jonas | University of Bonn |
| Rückin, Julius | University of Bonn |
| Popovic, Marija | University of Bonn |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Reinforcement Learning, Cooperating Robots
Abstract: Efficient aerial data collection is important in many remote sensing applications. In large-scale monitoring scenarios, deploying a team of unmanned aerial vehicles (UAVs) offers improved spatial coverage and robustness against individual failures. However, a key challenge is cooperative path planning for the UAVs to efficiently achieve a joint mission goal. We propose a novel multi-agent informative path planning approach based on deep reinforcement learning for adaptive terrain monitoring scenarios using UAV teams. We introduce new network feature representations to effectively learn path planning in a 3D workspace. By leveraging a counterfactual baseline, our approach explicitly addresses credit assignment to learn cooperative behaviour. Our experimental evaluation shows improved planning performance, i.e., faster mapping of regions of interest, with respect to non-counterfactual variants. Results on synthetic and real-world data show that our approach has superior performance compared to state-of-the-art non-learning-based methods, while being transferable to varying team sizes and communication constraints.
|
| |
| 08:48-08:54, Paper MoAT7.4 | Add to My Program |
| Collective Intelligence for 2D Push Manipulations with Mobile Robots |
|
| Kuroki, So | The University of Tokyo |
| Matsushima, Tatsuya | The University of Tokyo |
| Arima, Jumpei | Matsuo Institute |
| Furuta, Hiroki | The University of Tokyo |
| Matsuo, Yutaka | The University of Tokyo |
| Gu, Shixiang Shane | OpenAI |
| Tang, Yujin | Google |
Keywords: Cooperating Robots, Mobile Manipulation, Imitation Learning
Abstract: While natural systems often present collective intelligence that allows them to self-organize and adapt to changes, the equivalent is missing in most artificial systems. We explore the possibility of such a system in the context of cooperative 2D push manipulations using mobile robots. Although conventional works demonstrate potential solutions for the problem in restricted settings, they have computational and learning difficulties. More importantly, these systems do not possess the ability to adapt when facing environmental changes. In this work, we show that by distilling a planner derived from a differentiable soft-body physics simulator into an attention-based neural network, our multi-robot push manipulation system achieves better performance than baselines. In addition, our system also generalizes to configurations not seen during training and is able to adapt to complete tasks when external turbulence and environmental changes are applied.
|
| |
| 08:54-09:00, Paper MoAT7.5 | Add to My Program |
| Emergent Cooperative Behavior in Distributed Target Tracking with Unknown Occlusions |
|
| Li, Tianqi | Texas A&M University |
| Krakow, Lucas | Texas A&M University |
| Gopalswamy, Swaminathan | Texas A&M University |
Keywords: Cooperating Robots, Reactive and Sensor-Based Planning, Behavior-Based Systems
Abstract: Tracking multiple moving objects of interest (OOI) with multiple robot systems (MRS) has been addressed by active sensing that maintains a shared belief of OOIs and plans the motion of robots to maximize the information quality. Mobility of robots enables the behavior of pursuing better visibility, which is constrained by the sensor field of view (FoV) and occluding objects. We first extend prior work to detect, maintain, and share occlusion information explicitly, allowing us to generate occlusion-aware plans even if a priori semantic occlusion information is unavailable. The efficacy of active sensing approaches is often evaluated according to estimation error and information gain metrics. However, these metrics do not directly explain the level of cooperative behavior engendered by the active sensing algorithms. Next, we extract different emergent cooperative behaviors that stem from the same underlying algorithms but manifest differently under differing scenarios. In particular, we highlight and demonstrate three emergent behavior patterns in active sensing MRS: (i) Change of tracking responsibility between agents when tracking trajectories with divergent directions or due to a re-allocation of the resource among heterogeneous agents; (ii) Awareness of occlusions to a trajectory and temporal leave-and-return of the sensing agent; (iii) Sharing of locally detected occlusion objects in the MRS, which subsequently improves the awareness of occlusions.
|
| |
| 09:00-09:06, Paper MoAT7.6 | Add to My Program |
| Multi-Objective Sparse Sensing with Ergodic Optimization |
|
| Rao, Ananya | Carnegie Mellon University |
| Choset, Howie | Carnegie Mellon University |
Keywords: Motion and Path Planning, Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems
Abstract: We consider a search problem where a robot has one or more types of sensors, each suited to detecting different types of targets or target information. Often, information in the form of a distribution of possible target locations, or locations of interest, may be available to guide the search. When multiple types of information exist, then a distribution for each type of information must also exist, thereby making the search problem that uses these distributions to guide the search a multi-objective one. In this paper, we consider a multi-objective search problem when the "cost" to use a sensor is limited. To this end, we leverage the ergodic metric, which drives agents to spend time in regions proportional to the expected amount of information there. We define the multi-objective sparse sensing ergodic (MO-SS-E) metric in order to optimize when and where each sensor measurement should be taken while planning trajectories that balance the multiple objectives. We observe that our approach maintains coverage performance even as the number of samples taken is considerably reduced. Further empirical results on different multi-agent problem setups demonstrate the applicability of our approach for both homogeneous and heterogeneous multi-agent teams.
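The ergodic metric referenced above can be illustrated with a toy 1-D computation: it compares a trajectory's time-averaged basis coefficients against those of a target information distribution. The basis size, weights, and distributions below are illustrative choices, not the MO-SS-E formulation itself:

```python
import math

# Toy 1-D ergodic metric on [0, 1]: a trajectory is "ergodic" w.r.t. a target
# distribution when its time-averaged cosine-basis coefficients match those
# of the target. Weights and basis size are hypothetical choices.

def fourier_coeff(samples, k):
    """Time-average of the k-th cosine basis function over the samples."""
    return sum(math.cos(k * math.pi * s) for s in samples) / len(samples)

def ergodic_metric(traj, target_samples, num_k=8):
    err = 0.0
    for k in range(1, num_k + 1):
        ck = fourier_coeff(traj, k)               # trajectory statistics
        phik = fourier_coeff(target_samples, k)   # target distribution
        lam = 1.0 / (1.0 + k * k)                 # Sobolev-type weight
        err += lam * (ck - phik) ** 2
    return err

uniform = [i / 99 for i in range(100)]            # target: uniform coverage
spread = ergodic_metric(uniform, uniform)         # trajectory matches target
parked = ergodic_metric([0.1] * 100, uniform)     # robot that never moves
```

A trajectory that matches the target distribution scores (near) zero, while a robot parked at one spot scores strictly worse, which is exactly the pressure that drives ergodic planners to cover information-rich regions.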
|
| |
| 09:06-09:12, Paper MoAT7.7 | Add to My Program |
| Team Coordination on Graphs with State-Dependent Edge Costs |
|
| Limbu, Manshi | George Mason University |
| Hu, Zechen | George Mason University |
| Oughourli, Sara | George Mason University |
| Wang, Xuan | George Mason University |
| Xiao, Xuesu | George Mason University |
| Shishika, Daigo | George Mason University |
Keywords: Planning, Scheduling and Coordination, Cooperating Robots, Multi-Robot Systems
Abstract: This paper studies a team coordination problem in a graph environment. Specifically, we incorporate a "support" action that an agent can take to reduce the cost for its teammate to traverse some high-cost edges. Due to this added feature, the graph traversal is no longer a standard multi-agent path planning problem. To solve this new problem, we propose a novel formulation that poses it as a planning problem in a joint state space: the joint state graph (JSG). Since the edges of the JSG implicitly incorporate the support actions taken by the agents, we can now optimize the joint actions by solving a standard single-agent path planning problem on the JSG. One main drawback of this approach is the curse of dimensionality in both the number of agents and the size of the graph. To improve scalability in graph size, we further propose a hierarchical decomposition method to perform path planning on two levels. We provide both theoretical and empirical complexity analyses to demonstrate the efficiency of our two algorithms.
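The joint-state-graph idea above can be sketched on a tiny example: the joint state is the pair of both agents' nodes, support is encoded in the edge costs, and a standard single-agent Dijkstra search finds the coordinated plan. The graph, costs, support rule, and the one-agent-moves-per-step simplification below are made-up for illustration, not the paper's construction:

```python
import heapq

# Two agents on a 3-node graph. Edge B-C is expensive (5.0) unless the
# teammate stands at the support node B, which halves it. All values and
# the support rule are hypothetical.

nodes = ["A", "B", "C"]
edges = {("A", "B"): 1.0, ("B", "C"): 5.0}
support_node = "B"

def edge_cost(u, v, teammate):
    base = edges.get((u, v), edges.get((v, u)))
    if base is None:
        return None                              # no such edge
    if {u, v} == {"B", "C"} and teammate == support_node:
        return base / 2                          # supported traversal
    return base

def joint_shortest(start, goal):
    """Dijkstra on the joint state space (single-agent search on the JSG)."""
    dist = {start: 0.0}
    pq = [(0.0, start)]
    while pq:
        d, state = heapq.heappop(pq)
        if state == goal:
            return d
        if d > dist.get(state, float("inf")):
            continue
        p, q = state
        # simplification: one agent moves per step
        moves = [((n, q), edge_cost(p, n, q)) for n in nodes if n != p]
        moves += [((p, n), edge_cost(q, n, p)) for n in nodes if n != q]
        for nxt, c in moves:
            if c is None:
                continue
            nd = d + c
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(pq, (nd, nxt))
    return float("inf")

cost = joint_shortest(("A", "A"), ("C", "C"))
```

Planning each agent independently would cost 12 here (1 + 5 each), whereas the joint search discovers that one agent should wait at B to support the other's crossing, for a total of 9.5.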
|
| |
| 09:12-09:18, Paper MoAT7.8 | Add to My Program |
| Incorporating Stochastic Human Driving States in Cooperative Driving between a Human-Driven Vehicle and an Autonomous Vehicle |
|
| Hossain, Sanzida | Oklahoma State University |
| Lu, Jiaxing | Oklahoma State University |
| Bai, He | Oklahoma State University |
| Sheng, Weihua | Oklahoma State University |
Keywords: Cooperating Robots, Intelligent Transportation Systems, Human Factors and Human-in-the-Loop
Abstract: Modeling a human-driven vehicle is a difficult subject since human drivers have a variety of stochastic behavioral components that influence their driving styles. We develop a cooperative driving framework to incorporate different human behavior aspects, including the attentiveness of a driver and the tendency of the driver following advising commands. To demonstrate the framework, we consider the merging coordination between a human-driven vehicle and an autonomous vehicle (AV) in a connected environment. We propose a stochastic model predictive controller (sMPC) to address the stochasticity in human driving behavior and design coordinated merging actions to optimize the AV input and influence human driving behavior through advising commands. Simulation and human-in-the-loop (HITL) experimental results show that our formulation is capable of accommodating a distracted driver and optimizing AV inputs based on human driving behavior recognition.
|
| |
| 09:18-09:24, Paper MoAT7.9 | Add to My Program |
| Epistemic Planning for Heterogeneous Robotic Systems |
|
| Bramblett, Lauren | University of Virginia |
| Bezzo, Nicola | University of Virginia |
Keywords: Cooperating Robots, Path Planning for Multiple Mobile Robots or Agents, Task and Motion Planning
Abstract: In applications such as search and rescue or disaster relief, heterogeneous multi-robot systems (MRS) can provide significant advantages for complex objectives that require a suite of capabilities. However, within these application spaces, communication is often unreliable, causing inefficiencies or outright failures to arise in most MRS algorithms. Many researchers tackle this problem by requiring all robots to either maintain communication using proximity constraints or assuming that all robots will execute a predetermined plan over long periods of disconnection. The latter method allows for higher levels of efficiency in an MRS, but failures and environmental uncertainties can have cascading effects across the system, especially when a mission objective is complex or time-sensitive. To solve this, we propose an epistemic planning framework that allows robots to reason about the system state, leverage heterogeneous system makeups, and optimize information dissemination to disconnected neighbors. Dynamic epistemic logic formalizes the propagation of belief states, and epistemic task allocation and gossip are accomplished via a mixed integer program that uses the belief states for utility predictions and planning. The proposed framework is validated using simulations and experiments with heterogeneous vehicles.
|
| |
| 09:24-09:30, Paper MoAT7.10 | Add to My Program |
| Reinforced Potential Field for Multi-Robot Motion Planning in Cluttered Environments |
|
| Zhang, Dengyu | Sun Yat-Sen University |
| Zhang, Xinyu | Sun Yat-Sen University |
| Zhang, Zheng | Sun Yat-Sen University |
| Zhu, Bo | Sun Yat-Sen University |
| Zhang, Qingrui | Sun Yat-Sen University |
Keywords: Multi-Robot Systems, Motion and Path Planning, Collision Avoidance
Abstract: Motion planning is challenging for multiple robots in cluttered environments without communication, especially in view of real-time efficiency, motion safety, distributed computation, and trajectory optimality. In this paper, a reinforced potential field method is developed for distributed multi-robot motion planning, which is a synthesized design of reinforcement learning and artificial potential fields. An observation embedding with a self-attention mechanism is presented to model the robot-robot and robot-environment interactions. A soft wall-following rule is developed to improve the trajectory smoothness. Our method belongs to reactive planning, but environment properties are implicitly encoded. The number of robots in our method can be scaled up arbitrarily. The performance improvement over vanilla APF and RL methods has been demonstrated via numerical simulations. Experiments are also performed using quadrotors to further illustrate the competence of our method.
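For readers unfamiliar with the classical baseline the abstract reinforces, a minimal 2-D artificial potential field (APF) step looks like the sketch below. The gains, influence radius, and force cap are hypothetical values chosen for this toy example, not the paper's parameters:

```python
import math

# Classical APF: attractive force toward the goal plus repulsive forces from
# obstacles inside an influence radius d0, with a force cap for stability.
# All gains and radii are made-up illustrative values.

def apf_step(pos, goal, obstacles, k_att=1.0, k_rep=0.5, d0=1.0,
             fmax=2.0, dt=0.1):
    fx = k_att * (goal[0] - pos[0])               # attraction to the goal
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:                      # repulsion within radius d0
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 1e-9 < d < d0:
            mag = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d
    norm = math.hypot(fx, fy)                     # cap the force magnitude
    if norm > fmax:
        fx, fy = fx * fmax / norm, fy * fmax / norm
    return (pos[0] + dt * fx, pos[1] + dt * fy)

p = (0.0, 0.0)
for _ in range(300):
    p = apf_step(p, goal=(2.0, 0.0), obstacles=[(1.0, 0.2)])
```

Here the robot detours below the off-axis obstacle and settles near the goal; the well-known failure modes of this baseline (local minima, oscillation in clutter) are what the learned reinforcement motivates.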
|
| |
| 09:30-09:36, Paper MoAT7.11 | Add to My Program |
| Robot Team Data Collection with Anywhere Communication |
|
| Schack, Matthew | Colorado School of Mines |
| Rogers III, John G. | US Army Research Laboratory |
| Han, Qi | Colorado School of Mines |
| Dantam, Neil | Colorado School of Mines |
Keywords: Multi-Robot Systems, Cooperating Robots, Path Planning for Multiple Mobile Robots or Agents
Abstract: Using robots to collect data is an effective way to obtain information from the environment and communicate it to a static base station. Furthermore, robots have the capability to communicate with one another, potentially decreasing the time for data to reach the base station. We present a Mixed Integer Linear Program that reasons about discrete routing choices, continuous robot paths, and their effect on the latency of the data collection task. We analyze our formulation, discuss optimization challenges inherent to the data collection problem, and propose a factored formulation that finds optimal answers more efficiently. Our work is able to find paths that reduce latency by up to 101% compared to treating all robots independently in our tested scenarios.
|
| |
| 09:36-09:42, Paper MoAT7.12 | Add to My Program |
| Coordination of Multiple Mobile Manipulators for Ordered Sorting of Cluttered Objects |
|
| Ahn, Jeeho | Korea University |
| Lee, Sebin | Sogang University |
| Nam, Changjoo | Sogang University |
Keywords: Cooperating Robots, Multi-Robot Systems, Manipulation Planning
Abstract: We present a coordination method for multiple mobile manipulators to sort objects in clutter. We consider the object rearrangement problem in which the objects must be sorted into different groups in a particular order. In clutter, the order constraints cannot be easily satisfied since some objects occlude others, so the occluded ones are not directly accessible to the robots. The objects occluding others need to be moved more than once to make the occluded objects accessible. Such rearrangement problems fall into the class of nonmonotone rearrangement problems, which are computationally intractable. While nonmonotone problems with order constraints are harder still, involving multiple robots requires additional computation for task allocation. In this work, we aim to develop a fast, albeit suboptimal, method for multi-robot coordination for ordered sorting in clutter. The proposed method finds a sequence of objects to be sorted using a search such that the order constraint in each group is satisfied. The search can solve nonmonotone instances that require temporary relocation of some objects to access the next object to be sorted. Once a complete sorting sequence is found, the objects in the sequence are assigned to multiple mobile manipulators using a greedy task allocation method. We develop four versions of the method with different search strategies. In the experiments, we show that our method can find a sorting sequence quickly (e.g., 4.6 sec with 20 objects sorted into five groups) even though the solved instances include hard nonmonotone ones. The extensive tests and the experiments in simulation show the ability of the method to solve the real-world sorting problem using multiple mobile manipulators.
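A greedy task allocation of the kind mentioned above can be sketched as follows: each object in the sorting sequence goes to the robot whose accumulated cost plus the cost of taking that object is lowest. The costs below are made-up numbers for illustration, not the paper's model:

```python
# Hedged sketch of greedy task allocation: tasks arrive in a fixed (sorting)
# order; each is assigned to the robot minimizing its current load plus the
# per-robot cost of the task. Cost values are hypothetical.

def greedy_allocate(task_costs, num_robots):
    load = [0.0] * num_robots
    assignment = []
    for costs in task_costs:         # costs[r] = cost if robot r takes it
        r = min(range(num_robots), key=lambda r: load[r] + costs[r])
        assignment.append(r)
        load[r] += costs[r]
    return assignment, load

# four objects, two robots; each row gives the per-robot cost of one object
tasks = [[2.0, 3.0], [2.0, 1.0], [4.0, 4.0], [1.0, 5.0]]
plan, load = greedy_allocate(tasks, num_robots=2)
```

The greedy rule runs in O(tasks x robots) time, which is why it keeps the overall pipeline fast even when the search for a sorting sequence dominates.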
|
| |
| 09:42-09:48, Paper MoAT7.13 | Add to My Program |
| MOTLEE: Distributed Mobile Multi-Object Tracking with Localization Error Elimination |
|
| Peterson, Mason B. | Massachusetts Institute of Technology |
| Lusk, Parker C. | Massachusetts Institute of Technology |
| How, Jonathan | Massachusetts Institute of Technology |
Keywords: Distributed Robot Systems, Visual Tracking, Localization
Abstract: We present MOTLEE, a distributed mobile multi-object tracking algorithm that enables a team of robots to collaboratively track moving objects in the presence of localization error. Existing approaches to distributed tracking make limiting assumptions regarding the relative spatial relationship of sensors, including assuming a static sensor network or that perfect localization is available. Instead, we develop an algorithm based on the Kalman-Consensus filter for distributed tracking that properly leverages localization uncertainty in collaborative tracking. Further, our method allows the team to maintain an accurate understanding of dynamic objects in the environment by realigning robot frames and incorporating frame alignment uncertainty into our object tracking formulation. We evaluate our method in hardware on a team of three mobile ground robots tracking four people. Compared to previous works that do not account for localization error, we show that MOTLEE is resilient to localization uncertainties, enabling accurate tracking in distributed, dynamic settings with mobile tracking sensors.
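The consensus step at the heart of a Kalman-Consensus-style filter can be sketched in a few lines: after its local measurement update, each robot nudges its track estimate toward those of its communication neighbors. The gain, network, and estimates below are illustrative and not MOTLEE's actual formulation:

```python
# Toy consensus step: each robot holds a 2-D estimate of a tracked object's
# position and averages toward its neighbors' estimates each round. The
# consensus gain eps and the comms graph are hypothetical choices.

def consensus_step(estimates, neighbors, eps=0.2):
    new = {}
    for i, (xi, yi) in estimates.items():
        dx = sum(estimates[j][0] - xi for j in neighbors[i])
        dy = sum(estimates[j][1] - yi for j in neighbors[i])
        new[i] = (xi + eps * dx, yi + eps * dy)   # nudge toward neighbors
    return new

est = {0: (1.0, 0.0), 1: (1.4, 0.2), 2: (0.8, -0.2)}   # per-robot tracks
net = {0: [1, 2], 1: [0], 2: [0]}                       # symmetric comms graph
for _ in range(30):
    est = consensus_step(est, net)
```

With a symmetric graph and a small enough gain, the iterates converge to the average of the initial estimates, giving the team a single shared track; the paper's contribution is doing this while also estimating and correcting inter-robot frame misalignment.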
|
| |
| MoAT8 Regular session, 141 |
Add to My Program |
| Legged Robots I |
|
| |
| Chair: Behnke, Sven | University of Bonn |
| Co-Chair: Semini, Claudio | Istituto Italiano Di Tecnologia |
| |
| 08:30-08:36, Paper MoAT8.1 | Add to My Program |
| Dynamic Object Tracking for Quadruped Manipulator with Spherical Image-Based Approach |
|
| Zhang, Tianlin | Harbin Institute of Technology |
| Guo, Sikai | Harbin Institute of Technology |
| Xiong, Xiaogang | Harbin Institute of Technology, Shenzhen |
| Li, Wanlei | Harbin Institute of Technology(ShenZhen) |
| Qi, Zezheng | Harbin Institute of Technology, Shenzhen |
| Lou, Yunjiang | Harbin Institute of Technology, Shenzhen |
Keywords: Legged Robots, Visual Servoing, Visual Tracking
Abstract: Accurately estimating and tracking the motion of surrounding dynamic objects is one of the important tasks for the autonomy of a quadruped manipulator. However, with only an onboard RGB camera, it remains challenging for a quadruped manipulator to track the motion of a dynamic object moving with unknown and changing velocities. To address this problem, this manuscript proposes a novel image-based visual servoing (IBVS) approach consisting of three elements: a spherical projection model, a robust super-twisting observer, and a model predictive controller (MPC). The spherical projection model decouples the visual error of the dynamic target into linear and angular components. Then, in the presence of visual error, the robustness of the observer is exploited to estimate the unknown and changing velocities of the dynamic target without depth estimation. Finally, the estimated velocity is fed into the MPC to generate joint torques for the quadruped manipulator to track the motion of the dynamic target. The proposed approach is validated through hardware experiments, and the experimental results illustrate the approach's effectiveness in improving the autonomy of the quadruped manipulator.
|
| |
| 08:36-08:42, Paper MoAT8.2 | Add to My Program |
| Proprioception and Tail Control Enable Extreme Terrain Traversal by Quadruped Robots |
|
| Yang, Yanhao | Oregon State University |
| Norby, Joseph | Apptronik |
| Yim, Justin K. | University of Illinois Urbana-Champaign |
| Johnson, Aaron M. | Carnegie Mellon University |
Keywords: Legged Robots, Biologically-Inspired Robots, Optimization and Optimal Control
Abstract: Legged robots leverage ground contacts and the reaction forces they provide to achieve agile locomotion. However, uncertainty coupled with contact discontinuities can lead to failure, especially in real-world environments with unexpected height variations such as rocky hills or curbs. To enable dynamic traversal of extreme terrain, this work introduces 1) a proprioception-based gait planner for estimating unknown hybrid events due to elevation changes and responding by modifying contact schedules and planned footholds online, and 2) a two-degree-of-freedom tail for improving contact-independent control and a corresponding decoupled control scheme for better versatility and efficiency. Simulation results show that the gait planner significantly improves stability under unforeseen terrain height changes compared to methods that assume fixed contact schedules and footholds. Further, tests have shown that the tail is particularly effective at maintaining stability when encountering a terrain change with an initial angular disturbance. The results show that these approaches work synergistically to stabilize locomotion with elevation changes up to 1.5 times the leg length and tilted initial states.
|
| |
| 08:42-08:48, Paper MoAT8.3 | Add to My Program |
| Run and Catch: Dynamic Object-Catching of Quadrupedal Robots |
|
| You, Yangwei | Institute for Infocomm Research |
| Liu, Tianlin | Peking University |
| Liang, Xiaowei | Beijing Xiaomi Mobile Software Co., Ltd |
| Xu, Zhe | Beijing Institute of Technology |
| Zhou, Mingliang | Beijing Xiaomi Mobile Software Co., Ltd |
| Li, Zhibin (Alex) | University College London |
| Zhang, Shiwu | University of Science and Technology of China |
Keywords: Legged Robots, Whole-Body Motion Planning and Control, Climbing Robots
Abstract: Quadrupedal robots are demonstrating increasingly capable real-world behaviors, but are primarily limited to locomotion tasks. To expand their task-level abilities to object acquisition, i.e., run-to-catch like frisbee catching for dogs, this paper developed a control pipeline using stereo vision for legged robots that allows for dynamically catching balls while the robot is in motion. To achieve high-frame-rate tracking, we designed a ball that can actively emit homogeneous infrared (IR) light and then located the flying ball based on binocular vision positioning using the onboard RealSense D450 camera with an additional IR bandpass filter. The camera was mounted on top of a 2-DoF head to gain a full view of the target ball. A state estimation module was developed to fuse the vision positioning, camera motor readings, localization result of the RealSense T265 equipped on the back, and the legged odometry output altogether. With the use of a ballistic model, we achieved a robust estimation of both the ball and robot positions in an inertial coordinate frame. Additionally, we developed a closed-loop catching strategy and employed trajectory prediction so that tracking and run-to-catch were performed simultaneously, which is critical for such drastically dynamic and precise tasks. The proposed approach was validated through both static testing and dynamic catch experiments conducted on the CyberDog robot with a high success rate.
|
| |
| 08:48-08:54, Paper MoAT8.4 | Add to My Program |
| A Composite Control Strategy for Quadruped Robot by Integrating Reinforcement Learning and Model-Based Control |
|
| Lyu, Shangke | Nanyang Technological University |
| Zhao, Han | Beijing University of Posts and Telecommunications |
| Wang, Donglin | Westlake University |
Keywords: Legged Robots, Motion Control, Reinforcement Learning
Abstract: Locomotion in the wild requires the quadruped robot to have strong capabilities in adaptation and robustness. Deep reinforcement learning (DRL) exhibits huge potential in environmental adaptability, while its stability issues remain open. On the other hand, the quadruped robot dynamic model contains a lot of useful information that is beneficial to robust control. The combination of DRL with model-based control may combine both strengths and holds promise for better robustness. In this paper, DRL and the proposed model-based controller are firmly integrated in a novel manner such that the model-based controller is able to rectify the gait commands generated by DRL based on the system dynamic model, so as to enhance the robustness of the quadruped robot against external disturbances. Besides, a potential energy function is introduced to achieve compliant contact. The stability of the proposed method is ensured via passivity analysis. Several physical experiments are carried out to verify the performance of the proposed method.
|
| |
| 08:54-09:00, Paper MoAT8.5 | Add to My Program |
| Load Awareness: Sensorless Body Payload Sensing and Localization for Heavy Quadruped Robot |
|
| Liu, Shaoxun | Shanghai Jiao Tong University |
| Zhou, Shiyu | Shanghai Jiao Tong University |
| Pan, Zheng | Shanghai Jiao Tong University |
| Niu, Zhihua | Shanghai Jiao Tong University |
| Wang, Rongrong | Shanghai Jiao Tong University |
Keywords: Legged Robots, Contact Modeling, Dynamics
Abstract: Heavy quadruped robots have great potential for overcoming obstacles, showing great promise for transportation in complex environments. Ground reaction force (GRF) is a crucial state variable for quadrupedal control. Most GRF observers are implemented on lightweight quadrupeds, with little consideration of whether the load on the body is static or shifting. However, load information is vital for heavy-duty quadrupeds applied to transportation tasks. In this paper, we disassemble the whole-body dynamics into body dynamics combined with individual floating single-leg dynamics and observe the virtual coupling effects between the body and legs. Based on the observed coupling force and centroidal dynamics (CD), the GRF of a stance leg is obtained without knowledge of body weight, movement, or load information. Furthermore, we utilize the body dynamics and the observed virtual force to obtain the body's unknown payload. By reconstructing the moment balance equation, we obtain the payload's position with respect to the body coordinate frame. Compared to conventional quadrupedal GRF observation methods, this framework achieves higher observation accuracy on heavy quadrupeds without load and body information. Additionally, it enables real-time calculation of load magnitude and position.
|
| |
| 09:00-09:06, Paper MoAT8.6 | Add to My Program |
| Evolutionary-Based Online Motion Planning Framework for Quadruped Robot Jumping |
|
| Yue, Linzhu | The Chinese University of Hong Kong |
| Song, Zhitao | The Chinese University of Hong Kong |
| Zhang, Hongbo | The Chinese University of Hong Kong |
| Zhang, Lingwei | Hong Kong Centre for Logistics Robotics |
| Zeng, Xuanqi | Chinese University of Hong Kong |
| Liu, Yunhui | Chinese University of Hong Kong |
Keywords: Legged Robots, Whole-Body Motion Planning and Control, Motion and Path Planning
Abstract: Offline evolutionary-based methodologies have supplied a successful motion planning framework for quadrupedal jumping. However, the time-consuming computation caused by massive population evolution in offline evolutionary-based jumping frameworks significantly limits their popularity in the quadrupedal field. This paper presents a time-friendly online motion planning framework based on meta-heuristic Differential Evolution (DE), Latin hypercube sampling, and configuration space (DLC). The DLC framework establishes a multidimensional optimization problem leveraging centroidal dynamics to determine the ideal trajectory of the center of mass (CoM) and ground reaction forces (GRFs). The configuration space is introduced to the evolutionary optimization in order to condense the search region. Latin hypercube sampling offers more uniform initial populations for DE under limited sampling points, which accelerates escape from local minima. This research also constructs a collection of pre-motion trajectories as a warm start when the objective state is in the neighborhood of a pre-motion state, drastically reducing the solving time. The proposed methodology is successfully validated via real robot experiments for online jumping trajectory optimization with different jumping motions (e.g., ordinary jumping, flipping, and spinning).
|
| |
| 09:06-09:12, Paper MoAT8.7 | Add to My Program |
| Multi-IMU Proprioceptive Odometry for Legged Robots |
|
| Yang, Shuo | Carnegie Mellon University |
| Zhang, Zixin | Carnegie Mellon University |
| Bokser, Benjamin | Boston Dynamics AI Institute |
| Manchester, Zachary | Carnegie Mellon University |
Keywords: Legged Robots, Sensor Fusion, Contact Modeling
Abstract: This paper presents a novel, low-cost proprioceptive sensing solution for legged robots with point feet to achieve accurate low-drift long-term position and velocity estimation. In addition to conventional sensors, including one body Inertial Measurement Unit (IMU) and joint encoders, we attach an additional IMU to each calf link of the robot just above the foot. An extended Kalman filter is used to fuse data from all sensors to estimate the robot's body and foot positions in the world frame. Using the additional IMUs, the filter is able to reliably determine foot contact modes and detect foot slips without tactile or pressure-based foot contact sensors. This sensing solution is validated in various hardware experiments, which confirm that it can reduce position drift by nearly an order of magnitude compared to conventional approaches with only a very modest increase in hardware and computational costs.
|
| |
| 09:12-09:18, Paper MoAT8.8 | Add to My Program |
| Design and Motion Guidelines for Quadrupedal Locomotion of Maximum Speed or Efficiency with Serial and Parallel Legs |
|
| Machairas, Konstantinos | National Technical University of Athens |
| Papadopoulos, Evangelos | National Technical University of Athens |
Keywords: Legged Robots, Task and Motion Planning, Mechanism Design
Abstract: Analytical expressions are derived for actuator demands in quadrupedal locomotion of constant speed and height by using a reduction from a trot/pace 6-bar model to a single-legged model and employing two widely used two-segmented leg architectures, the serial and the parallel. A method is developed that outputs optimal gait characteristics and leg designs for a robot to move with maximum efficiency or speed. Also, generic guidelines are presented, which answer questions such as: which speed should be selected for maximum efficiency, or which is the optimal leg architecture (serial/parallel) and leg length for maximum efficiency or speed.
|
| |
| 09:18-09:24, Paper MoAT8.9 | Add to My Program |
| Towards Legged Locomotion on Steep Planetary Terrain |
|
| Valsecchi, Giorgio | Robotic System Lab, ETH |
| Weibel, Cedric | ETH Zuerich |
| Kolvenbach, Hendrik | ETH Zurich |
| Hutter, Marco | ETH Zurich |
Keywords: Legged Robots, Space Robotics and Automation, Reinforcement Learning
Abstract: Scientific exploration of planetary bodies is an activity well-suited for robots. Unfortunately, the regions that are richer in potential discoveries, such as impact craters, caves, and volcanic terraces, are hard to access with wheeled robots. Recent advances in legged locomotion have shown the potential of the technology to overcome difficult terrains such as slopes and slippery surfaces. In this work, we focus on locomotion on sandy slopes, comparing baseline state-of-the-art walking policies with a novel crawling-based gait for quadrupedal robots. We fine-tuned a state-of-the-art locomotion framework and introduced hardware modifications to the robot ANYmal, which enable walking on its knees. Moreover, we integrated a novel metric for stability, the stability margin, into the training process to increase robustness in such conditions. We benchmarked the locomotion policies in simulation and in real-world experiments on Martian soil simulant. Results show an improvement in locomotion performance and a more robust gait at higher slope angles.
|
| |
| 09:24-09:30, Paper MoAT8.10 | Add to My Program |
| Dynamic Hybrid Locomotion and Jumping for Wheeled-Legged Quadrupeds |
|
| Hosseini, Mojtaba | University of Bonn |
| Rodriguez, Diego | University of Bonn |
| Behnke, Sven | University of Bonn |
Keywords: Legged Robots, Wheeled Robots, Whole-Body Motion Planning and Control
Abstract: Hybrid wheeled-legged quadrupeds have the potential to navigate challenging terrain with agility and speed over long distances. However, obstacles can impede their progress by requiring the robots either to slow down to step over obstacles or to modify their path to circumvent them. We propose a motion optimization framework for quadruped robots that incorporates non-steerable wheels and dynamic jumps, enabling them to perform hybrid wheeled-legged locomotion while overcoming obstacles without slowing down. Our approach involves a model predictive controller that uses a time-varying rigid body dynamics model of the robot, including legs and wheels, to track dynamic motions such as jumping. We also introduce a method for driving with minimal leg swings to reduce energy consumption by sparing the effort involved in lifting the wheels. Our method was tested successfully on the wheeled Mini Cheetah and the Unitree AlienGo robots. Further videos and results are available at https://www.ais.uni-bonn.de/%7ehosseini/iros2023
|
| |
| 09:30-09:36, Paper MoAT8.11 | Add to My Program |
| Quadrupedal Footstep Planning Using Learned Motion Models of a Black-Box Controller |
|
| Taouil, Ilyass | Istituto Italiano Di Tecnologia |
| Turrisi, Giulio | Istituto Italiano Di Tecnologia |
| Schleich, Daniel | University of Bonn |
| Barasuol, Victor | Istituto Italiano Di Tecnologia |
| Semini, Claudio | Istituto Italiano Di Tecnologia |
| Behnke, Sven | University of Bonn |
Keywords: Legged Robots, Motion and Path Planning, Machine Learning for Robot Control
Abstract: Legged robots are increasingly entering new domains and applications, including search and rescue, inspection, and logistics. However, for such systems to be valuable in real-world scenarios, they must be able to autonomously and robustly navigate irregular terrains. In many cases, robots that are sold on the market do not provide such abilities, being able to perform only blind locomotion. Furthermore, their controller cannot be easily modified by the end-user, requiring a new and time-consuming control synthesis. In this work, we present a local motion planning pipeline that extends the capabilities of a black-box walking controller that is only able to track high-level reference velocities. More precisely, we learn a set of motion models for such a controller that map high-level velocity commands to Center of Mass (CoM) and footstep motions. We then integrate these models with a variant of the A* algorithm to plan the CoM trajectory, footstep sequences, and corresponding high-level velocity commands based on visual information, allowing the quadruped to safely traverse irregular terrains on demand.
|
| |
| 09:36-09:42, Paper MoAT8.12 | Add to My Program |
| An Efficient Paradigm for Feasibility Guarantees in Legged Locomotion (I) |
|
| Abdalla, Abdelrahman | Italian Institute of Technology |
| Focchi, Michele | Università Di Trento |
| Orsolino, Romeo | Arrival Ltd |
| Semini, Claudio | Istituto Italiano Di Tecnologia |
Keywords: Legged Robots, Dynamics, Kinematics, Motion and Path Planning
Abstract: Developing feasible body trajectories for legged systems on arbitrary terrains is a challenging task. In this article, we present a paradigm that allows designing feasible Center of Mass (CoM) and body trajectories in an efficient manner. In our previous work (Orsolino et al., 2020), we introduced the notion of the two-dimensional feasible region, where static balance and the satisfaction of joint-torque limits were guaranteed whenever the projection of the CoM lay inside the proposed admissible region. In this work, we propose a general formulation of the improved feasible region that guarantees dynamic balance alongside the satisfaction of both joint-torque and kinematic limits in an efficient manner. To incorporate the feasibility of the kinematic limits, we introduce an algorithm that computes the reachable region of the CoM. Furthermore, we propose an efficient planning strategy that utilizes the improved feasible region to design feasible CoM and body orientation trajectories. Finally, we validate the capabilities of the improved feasible region and the effectiveness of the proposed planning strategy using simulations and experiments on the 90 kg hydraulically actuated quadruped and the 21 kg Aliengo robots.
|
| |
| MoAT9 Regular session, 142ABC |
Add to My Program |
| Motion and Path Planning I |
|
| |
| Chair: Hovakimyan, Naira | University of Illinois at Urbana-Champaign |
| Co-Chair: Bezzo, Nicola | University of Virginia |
| |
| 08:30-08:36, Paper MoAT9.1 | Add to My Program |
| Locomotion Planning of a Truss Robot on Irregular Terrain |
|
| Bae, Jangho | University of Pennsylvania |
| Park, Inha | Hanyang University |
| Yim, Mark | University of Pennsylvania |
| Seo, TaeWon | Hanyang University |
Keywords: Cellular and Modular Robots, Motion and Path Planning
Abstract: This paper proposes a new locomotion algorithm for truss robots on irregular terrain, in particular for the Variable Topology Truss (VTT) system. The previous Polygon-based Random Tree (PRT) search algorithm for support polygon generation is extended to irregular terrain while considering friction and internal force limitations. By characterizing the terrain, unreachable areas are excluded from the search to increase efficiency. A one-step rolling motion primitive is generated based on the kinematics, statics, and constraints of the VTT. The locomotion planning is completed by transforming and connecting multiple motion primitives with respect to the desired support polygons. The algorithm's performance is verified by conducting simulations in multiple types of environments.
|
| |
| 08:36-08:42, Paper MoAT9.2 | Add to My Program |
| A Model Predictive Path Integral Method for Fast, Proactive, and Uncertainty-Aware UAV Planning in Cluttered Environments |
|
| Higgins, Jacob | University of Virginia |
| Mohammad, Nicholas | University of Virginia |
| Bezzo, Nicola | University of Virginia |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation, Aerial Systems: Mechanics and Control
Abstract: Current motion planning approaches for autonomous mobile robots often assume that the low-level controller of the system is able to track the planned motion with very high accuracy. In practice, however, tracking error can be affected by many factors, and could lead to potential collisions when the robot must traverse a cluttered environment. To address this problem, this paper proposes a novel receding-horizon motion planning approach based on Model Predictive Path Integral (MPPI) control theory -- a flexible sampling-based control technique that requires minimal assumptions on vehicle dynamics and cost functions. This flexibility is leveraged to propose a motion planning framework that also considers a data-informed risk function. Using the MPPI algorithm as a motion planner also reduces the number of samples required by the algorithm, relaxing the hardware requirements for implementation. The proposed approach is validated through trajectory generation for a quadrotor unmanned aerial vehicle (UAV), where fast motion increases trajectory tracking error and can lead to collisions with nearby obstacles. Simulations and hardware experiments demonstrate that the MPPI motion planner proactively adapts to the obstacles that the UAV must negotiate, slowing down when near obstacles and moving quickly when away from obstacles, resulting in the complete elimination of collisions while still producing lively motion.
|
| |
| 08:42-08:48, Paper MoAT9.3 | Add to My Program |
| Energy-Efficient Team Orienteering Problem in the Presence of Time-Varying Ocean Currents |
|
| Mansfield, Ariella | University of Pennsylvania |
| G. Macharet, Douglas | Universidade Federal De Minas Gerais |
| Hsieh, M. Ani | University of Pennsylvania |
Keywords: Task and Motion Planning, Multi-Robot Systems, Planning, Scheduling and Coordination
Abstract: Autonomous Marine Vehicles (AMVs) have gained interest for scientific and commercial applications, including pipeline and algae bloom monitoring, contaminant tracking, and ocean debris removal. The Team Orienteering Problem (TOP) is relevant in this context as Multi-Robot Systems (MRSs) allow for better coverage of the area of interest, simultaneous data collection at different locations, and an increase in the overall robustness and efficiency of the mission. However, route planning for AMVs in dynamic ocean environments is challenging due to the coupling of environmental and vehicle dynamics. We propose a multi-objective formulation that accounts for the trade-offs between visiting multiple task locations and energy consumption by the vehicles subject to a time budget. Different from existing approaches, our method is able to leverage time-varying ocean currents to improve the energy efficiency of resulting routes. We validate our approach experimentally by superimposing ocean flow models with benchmark instances of the TOP.
|
| |
| 08:48-08:54, Paper MoAT9.4 | Add to My Program |
| Multi-Agent Multi-Objective Ergodic Search Using Branch and Bound |
|
| Kesarimangalam Srinivasan, Akshaya | Carnegie Mellon University |
| Gutow, Geordan | Carnegie Mellon University |
| Ren, Zhongqiang | Carnegie Mellon University |
| Abraham, Ian | Yale University |
| Vundurthy, Bhaskar | Carnegie Mellon University |
| Choset, Howie | Carnegie Mellon University |
Keywords: Task and Motion Planning, Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents
Abstract: Search and rescue applications often need multiple agents to complete a set of conflicting tasks. This paper studies a Multi-Agent Multi-Objective Ergodic Search (MA-MO-ES) approach to this problem where each objective or task is to cover a domain subject to an information map. The goal is to allocate tasks to agents so that all maps are covered ergodically. The combinatorial nature of task allocation makes it computationally expensive to solve optimally using brute force. Apart from a large number of possible allocations, computing the cost of a task allocation is itself a planning problem. To mitigate the computational challenge, we present a branch and bound-based algorithm with pruning techniques that reduce the number of allocations to be searched to find an optimal allocation. We also present an approach to leverage the similarity between information maps to further reduce computation. Extensive testing on 150 randomly generated test cases shows an order of magnitude improvement in runtime compared to an exhaustive brute force approach.
|
| |
| 08:54-09:00, Paper MoAT9.5 | Add to My Program |
| Leveraging Single-Goal Predictions to Improve the Efficiency of Multi-Goal Motion Planning with Dynamics |
|
| Lu, Yuanjie | George Mason University |
| Plaku, Erion | George Mason University |
Keywords: Motion and Path Planning, Nonholonomic Motion Planning
Abstract: Multi-goal motion planning requires a robot to plan collision-free and dynamically-feasible motions to reach multiple goals, often in unstructured, obstacle-rich environments. This is challenging due to the complex dependencies between navigation and high-level reasoning, requiring the robot to explore a vast space of feasible motions and goal sequences. Our approach combines machine learning and Traveling Salesman Problem (TSP) solvers with sampling-based motion planning. Machine learning predicts distances and directions between locations, considering obstacles and robot dynamics, which the TSP solver uses to compute promising tours. Sampling-based motion planning expands a motion tree to follow the tours along the predicted directions. We demonstrate the effectiveness of our approach through experiments with vehicle and snake-like robot models operating in unstructured environments with multiple goals.
|
| |
| 09:00-09:06, Paper MoAT9.6 | Add to My Program |
| DynGMP: Graph Neural Network-Based Motion Planning in Unpredictable Dynamic Environments |
|
| Zhang, Wenjin | Rutgers University |
| Zang, Xiao | Rutgers University |
| Huang, Lingyi | Rutgers University |
| Sui, Yang | Rutgers University |
| Yu, Jingjin | Rutgers University |
| Chen, Yingying | Rutgers University |
| Yuan, Bo | Rutgers University |
Keywords: Motion and Path Planning, Planning, Scheduling and Coordination, Deep Learning Methods
Abstract: Neural networks have already demonstrated attractive performance for solving motion planning problems, especially in static and predictable environments. However, efficient neural planners that can adapt to unpredictable dynamic environments, a highly demanded scenario in many practical applications, are still under-explored. To fill this research gap and enrich the existing motion planning approaches, in this paper we propose DynGMP, a graph neural network (GNN)-based planner that provides high-performance planning solutions in unpredictable dynamic environments. By fully leveraging prior exploration experience and minimizing the replanning cost incurred by environmental change, DynGMP achieves high planning performance and efficiency simultaneously. Empirical evaluations across different environments show that DynGMP can achieve close to 100% success rate with fast planning speed and short path cost. Compared with existing non-learning and learning-based counterparts, DynGMP shows very significant planning performance improvement, e.g., at least 2.7×, 2.2×, 2.4×, and 2× faster planning speed with low path distance in four environments, respectively.
|
| |
| 09:06-09:12, Paper MoAT9.7 | Add to My Program |
| Symbolic State Space Optimization for Long Horizon Mobile Manipulation Planning |
|
| Zhang, Xiaohan | SUNY Binghamton |
| Zhu, Yifeng | The University of Texas at Austin |
| Ding, Yan | SUNY Binghamton |
| Jiang, Yuqian | University of Texas at Austin |
| Zhu, Yuke | The University of Texas at Austin |
| Stone, Peter | University of Texas at Austin |
| Zhang, Shiqi | SUNY Binghamton |
Keywords: Task and Motion Planning, Mobile Manipulation, Service Robotics
Abstract: In existing task and motion planning (TAMP) research, it is a common assumption that experts manually specify the state space for task-level planning. A well-developed state space enables the desirable distribution of limited computational resources between task planning and motion planning. However, developing such task-level state spaces can be non-trivial in practice. In this paper, we consider a long horizon mobile manipulation domain including repeated navigation and manipulation. We propose Symbolic State Space Optimization (S3O) for computing a set of abstracted locations and their 2D geometric groundings for generating task-motion plans in such domains. Our approach has been extensively evaluated in simulation and demonstrated on a real mobile manipulator working on clearing up dining tables. Results show the superiority of the proposed method over TAMP baselines in task completion rate and execution time.
|
| |
| 09:12-09:18, Paper MoAT9.8 | Add to My Program |
| A Fast and Map-Free Model for Trajectory Prediction in Traffics |
|
| Xiang, Junhong | Chongqing University |
| Zhang, Jingmin | No. 208 Research Institute of China Ordnance Industries |
| Nan, Zhixiong | Chongqing University |
Keywords: Motion and Path Planning, Autonomous Agents, Deep Learning Methods
Abstract: Existing methods have two shortcomings: (i) nearly all models rely on high-definition (HD) maps, yet map information is not always available in real traffic scenes and HD map-building is expensive and time-consuming; and (ii) existing models usually focus on improving prediction accuracy at the expense of computing efficiency, yet efficiency is crucial for many real applications. To handle both, this paper proposes an efficient trajectory prediction model that does not depend on traffic maps. The core idea of our model is to encode each single agent's spatial-temporal information in the first stage and to explore multi-agent spatial-temporal interactions in the second stage. By comprehensively utilizing an attention mechanism, LSTM, a graph convolution network, and a temporal transformer across the two stages, our model is able to learn rich dynamic and interaction information of all agents. Our model achieves the highest performance among existing map-free methods and also exceeds most map-based state-of-the-art methods on the Argoverse dataset. In addition, our model exhibits faster inference than the baseline methods.
|
| |
| 09:18-09:24, Paper MoAT9.9 | Add to My Program |
| Local Non-Cooperative Games with Principled Player Selection for Scalable Motion Planning |
|
| Chahine, Makram | Massachusetts Institute of Technology |
| Firoozi, Roya | Stanford University |
| Xiao, Wei | MIT |
| Schwager, Mac | Stanford University |
| Rus, Daniela | MIT |
Keywords: Motion and Path Planning, Multi-Robot Systems, Aerial Systems: Applications
Abstract: Game-theoretic motion planners are a powerful tool for the control of interactive multi-agent robot systems. Indeed, contrary to predict-then-plan paradigms, game-theoretic planners do not ignore the interactive nature of the problem, and simultaneously predict the behaviour of other agents while considering change in one's policy. This, however, comes at the expense of computational complexity, especially as the number of agents considered grows. In fact, planning with more than a handful of agents can quickly become intractable, disqualifying game-theoretic planners as possible candidates for large scale planning. In this paper, we propose a planning algorithm enabling the use of game-theoretic planners in robot systems with a large number of agents. Our planner is based on the reality of locality of information and thus deploys local games with a selected subset of agents in a receding horizon fashion to plan collision avoiding trajectories. We propose five different principled schemes for selecting game participants and compare their collision avoidance performance. We observe that the use of Control Barrier Functions for priority ranking is a potent solution to the player selection problem for motion planning.
|
| |
| 09:24-09:30, Paper MoAT9.10 | Add to My Program |
| Target Attribute Perception Based UAV Real-Time Task Planning in Dynamic Environments |
|
| He, Jinhong | Huazhong University of Science and Technology |
| Sun, Zheyu | Huazhong University of Science and Technology |
| Ming, Delie | Huazhong University of Science and Technology |
| Cai, Chao | Huazhong University of Science and Technology |
| Cao, Ningbo | Huazhong University of Science and Technology |
Keywords: Motion and Path Planning, Computer Vision for Automation, Deep Learning for Visual Perception
Abstract: In this paper, a comprehensive solution for enabling an unmanned aerial vehicle (UAV) to autonomously fly through complex and dynamic environments is proposed. Since moving objects each have unique property information, we propose a method that utilizes deep learning for 3D dynamic environment perception while taking into account limitations in computing resources. For safer dynamic avoidance, we first model the dynamic target and integrate it into a static occupancy grid map, and then construct a gradient field based on its attribute information. To achieve autonomous UAV flight in dynamic environments, we design an adaptive planning method based on gradient optimisation, which achieves significant computational savings by autonomously adjusting the planning frequency and using manually constructed gradients instead of maintaining a signed distance field (SDF). We have integrated the above approach into a customised quadrotor system and thoroughly tested it in the real world, verifying its flexibility in handling multiple objects with variable-speed motion in complex environments.
|
| |
| 09:30-09:36, Paper MoAT9.11 | Add to My Program |
| Simultaneous Spatial and Temporal Assignment for Fast UAV Trajectory Optimization Using Bilevel Optimization |
|
| Chen, Qianzhong | University of Illinois Urbana-Champaign |
| Cheng, Sheng | University of Illinois Urbana-Champaign |
| Hovakimyan, Naira | University of Illinois at Urbana-Champaign |
Keywords: Constrained Motion Planning, Aerial Systems: Applications, Optimization and Optimal Control
Abstract: In this paper, we propose a framework for fast trajectory planning for unmanned aerial vehicles (UAVs). Our framework is reformulated from an existing bilevel optimization, in which the lower-level problem solves for the optimal trajectory with a fixed time allocation, whereas the upper-level problem updates the time allocation using analytical gradients. The lower-level problem incorporates the safety-set constraints (in the form of inequality constraints) and is cast as a convex quadratic program (QP). Our formulation modifies the lower-level QP by excluding the inequality constraints for the safety sets, which significantly reduces the computation time. The safety-set constraints are moved to the upper-level problem, where the feasible waypoints are updated together with the time allocation using analytical gradients enabled by OptNet. We validate our approach in simulations, where our method's computation time scales linearly with respect to the number of safety sets, in contrast to the state-of-the-art that scales exponentially.
|
| |
| 09:36-09:42, Paper MoAT9.12 | Add to My Program |
| A Non-Prehensile Object Transportation Framework with Adaptive Tilting Based on Quadratic Programming |
|
| Subburaman, Rajesh | University of Naples Federico II |
| Selvaggio, Mario | Università Degli Studi Di Napoli Federico II |
| Ruggiero, Fabio | Università Di Napoli Federico II |
Keywords: Dexterous Manipulation, Optimization and Optimal Control, Intelligent Transportation Systems
Abstract: This work proposes an operational space control framework for non-prehensile object transportation using a robot arm. The control actions for the manipulator are computed by solving a quadratic programming (QP) problem considering the object's and manipulator's kinematic and dynamic constraints. Given the desired transportation trajectory, the proposed controller generates control commands for the robot to achieve the desired motion whilst preventing object slippage. In particular, the controller minimizes the occurrence of object slippage by adaptively regulating the tray orientation. The proposed approach has been extensively evaluated numerically with a 7-degree-of-freedom manipulator, and it is also verified and validated with a real experimental setup.
|
| |
| 09:42-09:48, Paper MoAT9.13 | Add to My Program |
| Dynamic Optimization Fabrics for Motion Generation (I) |
|
| Spahn, Max | TU Delft |
| Wisse, Martijn | Delft University of Technology |
| Alonso-Mora, Javier | Delft University of Technology |
Keywords: Mobile Manipulation, Nonholonomic Motion Planning, Motion Control of Manipulators, Geometric Control
Abstract: Optimization fabrics are a geometric approach to real-time local motion generation, where motions are designed by the composition of several differential equations that exhibit a desired motion behavior. We generalize this framework to dynamic scenarios and non-holonomic robots and prove that fundamental properties can be conserved. We show that convergence to desired trajectories and avoidance of moving obstacles can be guaranteed using simple construction rules of the components. Additionally, we present the first quantitative comparisons between optimization fabrics and model predictive control and show that optimization fabrics can generate similar trajectories with better scalability, and thus, much higher replanning frequency (up to 500 Hz with a 7 degrees of freedom robotic arm). Finally, we present empirical results on several robots, including a non-holonomic mobile manipulator with 10 degrees of freedom and avoidance of a moving human, supporting the theoretical findings.
|
| |
| MoAT10 Regular session, 250ABC |
Add to My Program |
| Learning for Manipulation I |
|
| |
| Chair: Lou, Xibai | University of Minnesota Twin Cities |
| Co-Chair: Garcia, Ricardo | Inria |
| |
| 08:30-08:36, Paper MoAT10.1 | Add to My Program |
| Foldsformer: Learning Sequential Multi-Step Cloth Manipulation with Space-Time Attention |
|
| Mo, Kai | Tsinghua University, Shenzhen International Graduate School |
| Xia, Chongkun | Tsinghua University |
| Wang, Xueqian | Center for Artificial Intelligence and Robotics, Graduate School |
| Deng, Yuhong | Tsinghua University |
| Gao, Xue-Hai | Tsinghua University |
| Liang, Bin | Tsinghua University |
Keywords: Deep Learning in Grasping and Manipulation, Perception-Action Coupling
Abstract: Sequential multi-step cloth manipulation is a challenging problem in robotic manipulation, requiring a robot to perceive the cloth state and plan a sequence of chained actions leading to the desired state. Most previous works address this problem in a goal-conditioned way, where a goal observation must be given for each specific task and cloth configuration, which is neither practical nor efficient. Thus, we present a novel multi-step cloth manipulation planning framework named Foldsformer. Foldsformer can complete similar tasks with only a general demonstration and utilizes a space-time attention mechanism to capture the instruction information behind this demonstration. We experimentally evaluate Foldsformer on four representative sequential multi-step manipulation tasks and show that Foldsformer significantly outperforms state-of-the-art approaches in simulation. Foldsformer can complete multi-step cloth manipulation tasks even when configurations of the cloth (e.g., size and pose) vary from those in the general demonstrations. Furthermore, our approach can be transferred from simulation to the real world without additional training or domain randomization. Despite being trained on rectangular cloths, we also show that our approach can generalize to unseen cloth shapes (T-shirts and shorts). Videos are available at https://sites.google.com/view/foldsformer.
|
| |
| 08:36-08:42, Paper MoAT10.2 | Add to My Program |
| GraNet: A Multi-Level Graph Network for 6-DoF Grasp Pose Generation in Cluttered Scenes |
|
| Wang, Haowen | Shanghai Jiao Tong University |
| Niu, Wanhao | Shanghai Jiao Tong University |
| Zhuang, Chungang | Shanghai Jiao Tong University |
Keywords: Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation, Computer Vision for Automation
Abstract: 6-DoF object-agnostic grasping in unstructured environments is a critical yet challenging task in robotics. Most current works use non-optimized approaches to sample grasp locations and learn spatial features without regard to the grasping task. This paper proposes GraNet, a graph-based grasp pose generation framework that translates a point cloud scene into multi-level graphs and propagates features through graph neural networks. By building graphs at the scene level, object level, and grasp point level, GraNet enhances feature embedding at multiple scales while progressively converging to the ideal grasping locations through learning. Our pipeline can thus characterize the spatial distribution of grasps in cluttered scenes, leading to a higher rate of effective grasping. Furthermore, we enhance the representation ability of scalable graph networks with a structure-aware attention mechanism that exploits local relations in graphs. Our method achieves state-of-the-art performance on the large-scale GraspNet-1Billion benchmark, especially in grasping unseen objects (+11.62 AP). The real robot experiment shows a high success rate in grasping scattered objects, verifying the effectiveness of the proposed approach in unstructured environments.
|
| |
| 08:42-08:48, Paper MoAT10.3 | Add to My Program |
| Modular Neural Network Policies for Learning In-Flight Object Catching with a Robot Hand-Arm System |
|
| Hu, Wenbin | University of Edinburgh |
| Acero, Fernando | University of Edinburgh |
| Triantafyllidis, Eleftherios | The University of Edinburgh |
| Liu, Zhaocheng | The University of Edinburgh |
| Li, Zhibin (Alex) | University College London |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Perception-Action Coupling
Abstract: We present a modular framework designed to enable a robot hand-arm system to learn how to catch flying objects, a task that requires fast, reactive, and accurately timed robot motions. Our framework consists of five core modules: (i) an object state estimator that learns object trajectory prediction, (ii) a catching pose quality network that learns to score and rank object poses for catching, (iii) a reaching control policy trained to move the robot hand to pre-catch poses, (iv) a grasping control policy trained to perform soft catching motions for safe and robust grasping, and (v) a gating network trained to synthesize the actions given by the reaching and grasping policies. The former two modules are trained via supervised learning and the latter three use deep reinforcement learning in a simulated environment. We conduct extensive evaluations of our framework in simulation for each module and the integrated system, demonstrating high success rates of in-flight catching and robustness to perturbations and sensory noise. Whilst only simple cylindrical and spherical objects are used for training, the integrated system shows successful generalization to a variety of household objects that are not used in training.
|
| |
| 08:48-08:54, Paper MoAT10.4 | Add to My Program |
| GVCCI: Lifelong Learning of Visual Grounding for Language-Guided Robotic Manipulation |
|
| Kim, Junghyun | Seoul National University |
| Kang, Gi-Cheon | Seoul National University |
| Kim, Jaein | Seoul National University |
| Shin, Suyeon | Seoul National University |
| Zhang, Byoung-Tak | Seoul National University |
Keywords: Multi-Modal Perception for HRI, Deep Learning Methods, Autonomous Agents
Abstract: Language-Guided Robotic Manipulation (LGRM) is a challenging task as it requires a robot to understand human instructions to manipulate everyday objects. Recent approaches in LGRM rely on pre-trained Visual Grounding (VG) models to detect objects without adapting to manipulation environments. This results in a performance drop due to a substantial domain gap between the pre-training and real-world data. A straightforward solution is to collect additional training data, but the cost of human annotation is prohibitive. In this paper, we propose Grounding Vision to Ceaselessly Created Instructions (GVCCI), a lifelong learning framework for LGRM, which continuously learns VG without human supervision. GVCCI iteratively generates synthetic instructions via object detection and trains the VG model with the generated data. We validate our framework in offline and online settings across diverse environments on different VG models. Experimental results show that accumulating synthetic data from GVCCI leads to a steady improvement in VG by up to 56.7% and improves resultant LGRM by up to 29.4%. Furthermore, the qualitative analysis shows that the unadapted VG model often fails to find correct objects due to a strong bias learned from the pre-training data. Finally, we introduce a novel VG dataset for LGRM, consisting of nearly 252k triplets of image-object-instruction from diverse manipulation environments.
|
| |
| 08:54-09:00, Paper MoAT10.5 | Add to My Program |
| Bag All You Need: Learning a Generalizable Bagging Strategy for Heterogeneous Objects |
|
| Bahety, Arpit | Columbia University |
| Jain, Shreeya | Columbia University |
| Ha, Huy | Columbia University |
| Hager, Nathalie | Columbia University |
| Burchfiel, Benjamin | Toyota Research Institute |
| Cousineau, Eric | Toyota Research Institute |
| Feng, Siyuan | Toyota Research Institute |
| Song, Shuran | Columbia University |
Keywords: Deep Learning in Grasping and Manipulation, Manipulation Planning, Service Robotics
Abstract: We introduce a practical robotics solution for the task of heterogeneous bagging, requiring the placement of multiple rigid and deformable objects into a deformable bag. This is a difficult task as it features complex interactions between multiple highly deformable objects under limited observability. To tackle these challenges, we propose a robotic system consisting of two learned policies: a rearrangement policy that learns to place multiple rigid objects and fold deformable objects in order to achieve desirable pre-bagging conditions, and a lifting policy to infer suitable grasp points for bi-manual bag lifting. We evaluate these learned policies on a real-world three-arm robot platform that achieves a 70% heterogeneous bagging success rate with novel objects. To facilitate future research and comparison, we also develop a novel heterogeneous bagging simulation benchmark that will be made publicly available.
|
| |
| 09:00-09:06, Paper MoAT10.6 | Add to My Program |
| Multi-Source Fusion for Voxel-Based 7-DoF Grasping Pose Estimation |
|
| Qiu, Junning | Xi'an Jiaotong University |
| Wang, Fei | Xi'an Jiaotong University |
| Dang, Zheng | EPFL |
Keywords: Deep Learning in Grasping and Manipulation, Visual Learning, Deep Learning Methods
Abstract: In this work, we tackle the problem of 7-DoF grasping pose estimation (6-DoF plus the opening width of a parallel-jaw gripper) from point cloud data, which is a fundamental task in robotic manipulation. Most existing methods adopt 3D voxel CNNs as the backbone for their efficiency in handling unordered point cloud data. However, we found that these approaches overlook detailed information in the point clouds, resulting in decreased performance. Through our analysis, we identified quantization loss and boundary information loss within 3D convolutional layers as the primary causes of this issue. To address these challenges, we introduce two novel branches: one adds an extra positional encoding operation to preserve details and unique features for each point, and the other uses a 2D CNN operating on the range-based image, which better aggregates boundary information on a continuous 2D domain. To integrate these branches with the original branch, we introduce a novel multi-source fusion gating mechanism to aggregate features. Our approach achieved state-of-the-art performance on the GraspNet-1Billion benchmark and demonstrated high success rates in real robotic experiments across different scenes, with the potential to improve the performance of robotic grasping systems.
|
| |
| 09:06-09:12, Paper MoAT10.7 | Add to My Program |
| VL-Grasp: A 6-Dof Interactive Grasp Policy for Language-Oriented Objects in Cluttered Indoor Scenes |
|
| Lu, Yuhao | Tsinghua University |
| Fan, Yixuan | Tsinghua University |
| Deng, Beixing | Tsinghua University |
| Liu, Fangfu | Tsinghua University |
| Li, Yali | Tsinghua University |
| Wang, Shengjin | Tsinghua University |
Keywords: Deep Learning in Grasping and Manipulation, Multi-Modal Perception for HRI, Data Sets for Robotic Vision
Abstract: Robotic grasping faces new challenges in human-robot interaction scenarios. We consider the task in which a robot grasps a target object designated by a human's language directives. The robot not only needs to locate the target based on vision-and-language information, but also needs to predict reasonable grasp pose candidates across various views and postures. In this work, we propose a novel interactive grasp policy, named Visual-Lingual-Grasp (VL-Grasp), to grasp the target specified by human language. First, we build a new challenging visual grounding dataset to provide functional training data for robotic interactive perception in indoor environments. Second, we propose a 6-DoF interactive grasp policy combining visual grounding and 6-DoF grasp pose detection to extend the universality of interactive grasping. Third, we design a grasp pose filter module to enhance the performance of the policy. Experiments demonstrate the effectiveness and extensibility of VL-Grasp in the real world. VL-Grasp achieves a success rate of 72.5% across different indoor scenes. The code and dataset are available at https://github.com/luyh20/VL-Grasp.
|
| |
| 09:12-09:18, Paper MoAT10.8 | Add to My Program |
| QDP: Learning to Sequentially Optimise Quasi-Static and Dynamic Manipulation Primitives for Robotic Cloth Manipulation |
|
| Blanco-Mulero, David | Aalto University |
| Alcan, Gokhan | Aalto University |
| Abu-Dakka, Fares | Technische Universität München |
| Kyrki, Ville | Aalto University |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Manipulation Planning
Abstract: Pre-defined manipulation primitives are widely used for cloth manipulation. However, cloth properties such as stiffness or density can highly impact the performance of these primitives. Although existing solutions have tackled the parameterisation of pick and place locations, the effect of factors such as the velocity or trajectory of quasi-static and dynamic manipulation primitives has been neglected. Choosing appropriate values for these parameters is crucial to cope with the range of materials present in household cloth objects. To address this challenge, we introduce the Quasi-Dynamic Parameterisable (QDP) method, which optimises parameters such as the motion velocity in addition to the pick and place positions of quasi-static and dynamic manipulation primitives. In this work, we leverage the framework of Sequential Reinforcement Learning to sequentially decouple the parameters that compose the primitives. To evaluate the effectiveness of the method, we focus on the task of cloth unfolding with a robotic arm in simulation and real-world experiments. Our results in simulation show that choosing optimal parameters for the primitives improves performance by 20% compared to sub-optimal ones. Real-world results demonstrate the advantage of modifying the velocity and height of manipulation primitives for cloths with different mass, stiffness, shape, and size. Supplementary material, videos, and code can be found at https://sites.google.com/view/qdp-srl.
|
| |
| 09:18-09:24, Paper MoAT10.9 | Add to My Program |
| Robust Visual Sim-To-Real Transfer for Robotic Manipulation |
|
| Garcia, Ricardo | Inria |
| Strudel, Robin | INRIA Paris |
| Chen, Shizhe | Inria |
| Arlaud, Etienne | INRIA |
| Laptev, Ivan | INRIA |
| Schmid, Cordelia | Inria |
Keywords: Deep Learning in Grasping and Manipulation, Learning from Demonstration, Transfer Learning
Abstract: Learning visuomotor policies in simulation is much safer and cheaper than in the real world. However, due to discrepancies between the simulated and real data, simulator-trained policies often fail when transferred to real robots. One common approach to bridge the visual sim-to-real domain gap is domain randomization (DR). While previous work mainly evaluates DR for disembodied tasks, such as pose estimation and object detection, here we systematically explore visual domain randomization methods and benchmark them on a rich set of challenging robotic manipulation tasks. In particular, we propose an off-line proxy task of cube localization to select DR parameters for texture randomization, lighting randomization, variations of object colors and camera parameters. Notably, we demonstrate that DR parameters have similar impact on our off-line proxy task and on-line policies. We, hence, use off-line optimized DR parameters to train visuomotor policies in simulation and directly apply such policies to a real robot. Our approach achieves 93% success rate on average when tested on a diverse set of challenging manipulation tasks. Moreover, we evaluate the robustness of policies to visual variations in real scenes and show that our simulator-trained policies outperform policies learned using real but limited data. Code, simulation environment, real robot datasets and trained models are available at https://www.di.ens.fr/willow/research/robust_s2r/.
|
| |
| 09:24-09:30, Paper MoAT10.10 | Add to My Program |
| Multi-Dimensional Deformable Object Manipulation Using Equivariant Models |
|
| Fu, Tianyu | East China University of Science and Technology |
| Tang, Yang | East China University of Science and Technology |
| Wu, Tianyu | East China University of Science and Technology |
| Xia, Xiaowu | East China University of Science and Technology |
| Wang, Jianrui | East China University of Science and Technology |
| Zhao, Chaoqiang | East China University of Science and Technology |
Keywords: Deep Learning in Grasping and Manipulation, Learning from Demonstration, Imitation Learning
Abstract: Manipulating deformable objects, such as ropes (1D), fabrics (2D), and bags (3D), is a very challenging problem in robotic research, since a deformable object has a high degree of freedom in its physical state and nonlinear dynamics. Compared with single-dimensional deformable objects, multi-dimensional object manipulation suffers from the difficulty of correctly recognizing the object's characteristics and making accurate action decisions for deformable objects of various dimensions. Some methods use neural networks to rearrange deformable objects in all dimensions, but these approaches are inaccurate in predicting the robot's motion because they only consider equivariance when picking objects. To address this problem, we present a novel Transporter Network encoded and decoded with equivariance to generalize to different picking and placing positions. Additionally, we propose an equivariant goal-conditioned model to enable the robot to manipulate deformable objects into flexible configurations without artificially marked visual anchors for the target position. Finally, experiments in Deformable-Ravens and the real world demonstrate that our equivariant models are more sample efficient than the traditional Transporter Network. The video is available at https://youtu.be/SH4aV2f0wt0.
|
| |
| 09:30-09:36, Paper MoAT10.11 | Add to My Program |
| Adversarial Object Rearrangement in Constrained Environments with Heterogeneous Graph Neural Networks |
|
| Lou, Xibai | University of Minnesota Twin Cities |
| Yu, Houjian | University of Minnesota, Twin Cities |
| Worobel, Ross | University of Minnesota |
| Yang, Yang | University of Minnesota |
| Choi, Changhyun | University of Minnesota, Twin Cities |
Keywords: Deep Learning in Grasping and Manipulation, Deep Learning for Visual Perception, Task and Motion Planning
Abstract: Adversarial object rearrangement in the real world (e.g., previously unseen or oversized items in kitchens and stores) could benefit from understanding task scenes, which inherently entail heterogeneous components such as current objects, goal objects, and environmental constraints. The semantic relationships among these components are distinct from each other and crucial for multi-skilled robots to perform efficiently in everyday scenarios. We propose a hierarchical robotic manipulation system that learns the underlying relationships and maximizes the collaborative power of its diverse skills (e.g., pick-place, push) for rearranging adversarial objects in constrained environments. The high-level coordinator employs a heterogeneous graph neural network (HetGNN), which reasons about the current objects, goal objects, and environmental constraints; the low-level 3D Convolutional Neural Network-based actors execute the action primitives. Our approach is trained entirely in simulation, and achieved an average success rate of 87.88% and a planning cost of 12.82 in real-world experiments, surpassing all baseline methods. Supplementary material is available at https://sites.google.com/umn.edu/versatile-rearrangement.
|
| |
| 09:36-09:42, Paper MoAT10.12 | Add to My Program |
| Probabilistic Slide-Support Manipulation Planning in Clutter |
|
| Shusei, Nagato | Osaka University |
| Motoda, Tomohiro | National Institute of Advanced Industrial Science and Technology |
| Nishi, Takao | Osaka University |
| Petit, Damien | Osaka University |
| Kiyokawa, Takuya | Osaka University |
| Wan, Weiwei | Osaka University |
| Harada, Kensuke | Osaka University |
Keywords: Deep Learning in Grasping and Manipulation, Bimanual Manipulation, Manipulation Planning
Abstract: To safely and efficiently extract an object from clutter, this paper presents a bimanual manipulation planner in which one hand of the robot slides the target object out of the clutter while the other hand supports the surrounding objects to prevent the clutter from collapsing. Our method uses a neural network to predict the physical behavior of the clutter when the target object is moved. We generate the most efficient action based on Monte Carlo tree search. The grasping and sliding actions are planned to minimize the number of motion sequences needed to pick the target object. In addition, the object to be supported is chosen to minimize the position change of surrounding objects. Experiments with a real bimanual robot confirmed that the robot could retrieve the target object, reducing the total number of motion sequences and improving safety.
|
| |
| 09:42-09:48, Paper MoAT10.13 | Add to My Program |
| GOATS: Goal Sampling Adaptation for Scooping with Curriculum Reinforcement Learning |
|
| Niu, Yaru | Carnegie Mellon University |
| Jin, Shiyu | Baidu |
| Zhang, Zeqing | The University of Hong Kong |
| Zhu, Jiacheng | Carnegie Mellon University |
| Zhao, Ding | Carnegie Mellon University |
| Zhang, Liangjun | Baidu |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning
Abstract: In this work, we first formulate the problem of robotic water scooping using goal-conditioned reinforcement learning. This task is particularly challenging due to the complex dynamics of fluids and the need to achieve multi-modal goals. The policy is required to successfully reach both position goals and water amount goals, which leads to a large, convoluted goal state space. To overcome these challenges, we introduce Goal Sampling Adaptation for Scooping (GOATS), a curriculum reinforcement learning method that can learn an effective and generalizable policy for robot scooping tasks. Specifically, we use a goal-factorized reward formulation and interpolate position goal distributions and amount goal distributions to create a curriculum throughout the learning process. As a result, our proposed method outperforms the baselines in simulation and achieves 5.46% and 8.71% amount errors on bowl scooping and bucket scooping tasks, respectively, under 1000 variations of initial water states in the tank and a large goal state space. Besides being effective in simulation environments, our method can efficiently adapt to noisy real-robot water-scooping scenarios with diverse physical configurations and unseen settings, demonstrating superior efficacy and generalizability. The videos of this work are available on our project page: https://sites.google.com/view/goatscooping.
|
| |
| MoAT11 Regular session, 251ABC |
Add to My Program |
| Aerial Systems - Applications I |
|
| |
| Chair: Min, Byung-Cheol | Purdue University |
| Co-Chair: Lee, Jongseok | German Aerospace Center |
| |
| 08:30-08:36, Paper MoAT11.1 | Add to My Program |
| Auto Filmer: Autonomous Aerial Videography under Human Interaction |
|
| Zhang, Zhiwei | Zhejiang University |
| Zhong, Yuhang | Nankai University |
| Guo, Junlong | Zhejiang University |
| Wang, Qianhao | Zhejiang University |
| Xu, Chao | Zhejiang University |
| Gao, Fei | Zhejiang University |
Keywords: Aerial Systems: Applications, Human-Aware Motion Planning, Aerial Systems: Perception and Autonomy
Abstract: The advance of unmanned aerial vehicles (UAVs) has enabled customers and directors to film from the air. However, operating a drone to produce the desired video of a moving object is difficult. This letter proposes an autonomous aerial videography system that integrates customized shots and drone dynamics. We design a user-friendly interface for the operator to create the desired shot in real-time. The shot information is then transmitted to the kinodynamic path search process, which evaluates a safe shooting path. Later, feasible regions and safe flight corridors are constructed for safety and visibility. Finally, a joint optimization is carried out to generate the trajectory of the quadrotor and the gimbal to maintain the required image composition. Extensive simulation and real-world experiments validate the effectiveness of our method.
|
| |
| 08:36-08:42, Paper MoAT11.2 | Add to My Program |
| New Era in Cultural Heritage Preservation: Cooperative Aerial Autonomy for Fast Digitalization of Difficult-To-Access Interiors of Historical Monuments (I) |
|
| Petráček, Pavel | Czech Technical University in Prague |
| Krátký, Vít | Czech Technical University in Prague |
| Baca, Tomas | Ceske Vysoke Uceni Technicke V Praze, FEL |
| Petrlik, Matej | Czech Technical University in Prague, Faculty of Electrical Engi |
| Saska, Martin | Czech Technical University in Prague |
Keywords: Aerial Systems: Applications, Aerial Systems: Perception and Autonomy, Multi-Robot Systems
Abstract: Digital documentation of large interiors of historical buildings is an exhausting task since most of the areas of interest are beyond typical human reach. We advocate the use of autonomous teams of multi-rotor Unmanned Aerial Vehicles (UAVs) to speed up the documentation process by several orders of magnitude while allowing for a repeatable, accurate, and condition-independent solution capable of precise collision-free operation at great heights. The proposed multi-robot approach allows for performing tasks requiring dynamic scene illumination in large-scale real-world scenarios, a process previously applicable only in small-scale laboratory-like conditions. Extensive experimental analyses range from single-UAV imaging to specialized lighting techniques requiring accurate coordination of multiple UAVs. The system's robustness is demonstrated in more than two hundred autonomous flights in fifteen historical monuments requiring superior safety while lacking access to external localization. This unique experimental campaign, carried out in cooperation with restorers and conservators, yielded numerous lessons transferable to other safety-critical robotic missions in documentation and inspection tasks.
|
| |
| 08:42-08:48, Paper MoAT11.3 | Add to My Program |
| Tight Collision Probability for UAV Motion Planning in Uncertain Environment |
|
| Liu, Tianyu | The University of Hong Kong |
| Zhang, Fu | University of Hong Kong |
| Gao, Fei | Zhejiang University |
| Pan, Jia | University of Hong Kong |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Collision Avoidance
Abstract: Operating unmanned aerial vehicles (UAVs) in complex environments that feature dynamic obstacles and external disturbances poses significant challenges, primarily due to the inherent uncertainty in such scenarios. Additionally, inaccurate robot localization and modeling errors further exacerbate these challenges. Recent research on UAV motion planning in static environments has been unable to cope with the rapidly changing surroundings, resulting in trajectories that may not be feasible. Moreover, previous approaches that have addressed dynamic obstacles or external disturbances in isolation are insufficient to handle the complexities of such environments. This paper proposes a reliable motion planning framework for UAVs, integrating various uncertainties into a chance constraint that characterizes the uncertainty in a probabilistic manner. The chance constraint provides a probabilistic safety certificate by calculating the collision probability between the robot's Gaussian-distributed forward reachable set and states of obstacles. To reduce the conservatism of the planned trajectory, we propose a tight upper bound of the collision probability and evaluate it both exactly and approximately. The approximated solution is used to generate motion primitives as a reference trajectory, while the exact solution is leveraged to iteratively optimize the trajectory for better results. Our method is thoroughly tested in simulation and real-world experiments, verifying its reliability and effectiveness in uncertain environments.
|
| |
| 08:48-08:54, Paper MoAT11.4 | Add to My Program |
| Dodging Like a Bird: An Inverted Dive Maneuver Taking by Lifting-Wing Multicopters |
|
| Gao, Wenhan | Beihang University |
| Wang, Shuai | Beihang University |
| Quan, Quan | Beihang University |
Keywords: Aerial Systems: Applications, Motion and Path Planning
Abstract: It is crucial for hybrid unmanned aerial vehicles, such as lifting-wing multicopters, to plan a continuous, smooth, and collision-free trajectory to avoid obstacles. Unlike quadcopters, which typically work in indoor environments, lifting-wing multicopters typically fly at a high altitude with a high cruising speed, requiring higher maneuverability in the vertical direction. Inspired by birds, lifting-wing multicopters can take an inverted flight maneuver to gain more maneuverability than the corresponding multicopter owing to the additional lifting wing. In this paper, a rotation-aware collision-free motion planning strategy is proposed that takes aerodynamics into consideration and allows lifting-wing multicopters to fly at large rotation angles, even in inverted postures. Specifically, a collision-free state sequence is found using rotation-aware primitives by solving a graph search problem. The sequence is then refined with B-spline into smooth trajectories to be tracked by the differential flatness-based controller for lifting-wing multicopters. We analyze the proposed motion planning algorithm in different scenarios and demonstrate the feasibility of the generated trajectories in simulation and real-world experiments.
|
| |
| 08:54-09:00, Paper MoAT11.5 | Add to My Program |
| Model-Based Planning and Control for Terrestrial-Aerial Bimodal Vehicles with Passive Wheels |
|
| Zhang, Ruibin | Zhejiang University |
| Lin, Junxiao | Zhejiang University |
| Wu, Yuze | Zhejiang University |
| Gao, Yuman | Zhejiang University |
| Wang, Chi | Zhejiang University |
| Xu, Chao | Zhejiang University |
| Cao, Yanjun | Zhejiang University, Huzhou Institute of Zhejiang University |
| Gao, Fei | Zhejiang University |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Motion Control
Abstract: Terrestrial and aerial bimodal vehicles have gained widespread attention due to their cross-domain maneuverability. Nevertheless, their bimodal dynamics significantly increase the complexity of motion planning and control, thus hindering robust and efficient autonomous navigation in unknown environments. To resolve this issue, we develop a model-based planning and control framework for terrestrial-aerial bimodal vehicles. This work begins by deriving a unified dynamic model and the corresponding differential flatness. Leveraging differential flatness, an optimization-based trajectory planner is proposed, which takes into account both solution quality and computational efficiency. Moreover, we design a tracking controller using nonlinear model predictive control based on the proposed unified dynamic model to achieve accurate trajectory tracking and smooth mode transition. We validate our framework through extensive benchmark comparisons and experiments, demonstrating its effectiveness in terms of planning quality and control performance.
|
| |
| 09:00-09:06, Paper MoAT11.6 | Add to My Program |
| Polynomial-Based Online Planning for Autonomous Drone Racing in Dynamic Environments |
|
| Wang, Qianhao | Zhejiang University |
| Wang, Dong | Zhejiang University |
| Xu, Chao | Zhejiang University |
| Gao, Alan | Fan'gang |
| Gao, Fei | Zhejiang University |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Task and Motion Planning
Abstract: In recent years, there has been noteworthy advancement in autonomous drone racing. However, the primary focus has been on attaining fast execution times, while scant attention is given to the challenges of dynamic environments. The high-speed nature of racing scenarios, coupled with the potential for unforeseeable environmental alterations, presents stringent requirements for online replanning and its timeliness. For racing in dynamic environments, we propose an online replanning framework with an efficient polynomial trajectory representation. We trade off between aggressive speed and flexible obstacle avoidance based on an optimization approach. Additionally, to ensure safety and precision when crossing intermediate racing waypoints, we formulate these demands as hard constraints during planning. For dynamic obstacles, parallel multi-topology trajectory planning is designed based on engineering considerations to prevent racing time loss due to local optima. The framework is integrated into a quadrotor system and demonstrated at the DJI Robomaster Intelligent UAV Championship, where it successfully completed the racing track and placed first, finishing in less than half the time of the second-place team.
|
| |
| 09:06-09:12, Paper MoAT11.7 | Add to My Program |
| Autonomous Power Line Inspection with Drones Via Perception-Aware MPC |
|
| Xing, Jiaxu | ETH Zurich |
| Cioffi, Giovanni | University of Zurich |
| Hidalgo Carrio, Javier | University of Zurich and ETH Zurich |
| Scaramuzza, Davide | University of Zurich |
Keywords: Aerial Systems: Applications, Aerial Systems: Perception and Autonomy
Abstract: Drones have the potential to revolutionize power line inspection by increasing productivity, reducing inspection time, improving data quality, and eliminating the risks for human operators. Current state-of-the-art systems for power line inspection have two shortcomings: (i) control is decoupled from perception and needs accurate information about the location of the power lines and masts; (ii) obstacle avoidance is decoupled from the power line tracking, which results in poor tracking in the vicinity of the power masts, and, consequently, in decreased data quality for visual inspection. In this work, we propose a model predictive controller (MPC) that overcomes these limitations by tightly coupling perception and action. Our controller generates commands that maximize the visibility of the power lines while, at the same time, safely avoiding the power masts. For power line detection, we propose a lightweight learning-based detector that is trained only on synthetic data and is able to transfer zero-shot to real-world power line images. We validate our system in simulation and real-world experiments on a mock-up power line infrastructure. We release our code and datasets to the public.
|
| |
| 09:12-09:18, Paper MoAT11.8 | Add to My Program |
| A Perching and Tilting Aerial Robot for Precise and Versatile Power Tool Work on Vertical Walls |
|
| Dautzenberg, Roman | ETH Zürich |
| Küster, Timo | ETH Zürich |
| Mathis, Timon | ETH Zürich |
| Roth, Yann | ETH Zürich |
| Steinauer, Curdin | ETH Zürich |
| Käppeli, Gabriel | ETH Zürich |
| Santen, Julian | ETH Zürich |
| Arranhado, Alina | ETH Zürich |
| Biffar, Friederike | ETH Zürich |
| Kötter, Till | ETH Zürich |
| Lanegger, Christian | ETH Zurich |
| Allenspach, Mike | ETH Zürich |
| Siegwart, Roland | ETH Zurich |
| Bähnemann, Rik | ETH Zürich |
Keywords: Aerial Systems: Applications, Robotics and Automation in Construction, Actuation and Joint Mechanisms
Abstract: Drilling, grinding, and setting anchors on vertical walls are fundamental processes in everyday construction work. Manually doing these works is error-prone, potentially dangerous, and elaborate at height. Today, heavy mobile ground robots can perform automatic power tool work. However, aerial vehicles could be deployed in untraversable environments and reach inaccessible places. Existing drone designs do not provide the large forces, payload, and high precision required for using power tools. This work presents the first aerial robot design to perform versatile manipulation tasks on vertical concrete walls with continuous forces of up to 150 N. The platform combines a quadrotor with active suction cups for perching on walls and a lightweight, tiltable linear tool table. This combination minimizes weight using the propulsion system for flying, surface alignment, and feed during manipulation and allows precise positioning of the power tool. We evaluate our design in a concrete drilling application - a challenging construction process that requires high forces, accuracy, and precision. In 30 trials, our design can accurately pinpoint a target position despite perching imprecision. Nine visually guided drilling experiments demonstrate a drilling precision of 6 mm without further automation. Aside from drilling, we also demonstrate the versatility of the design by setting an anchor into concrete.
|
| |
| 09:18-09:24, Paper MoAT11.9 | Add to My Program |
| Resource-Constrained Station-Keeping for Latex Balloons Using Reinforcement Learning |
|
| Saunders, Jack | University of Bath |
| Prenevost, Loïc | Lux Aerobot |
| Şimşek, Özgür | University of Bath |
| Hunter, Alan Joseph | University of Bath |
| Li, Wenbin | University of Bath |
Keywords: Aerial Systems: Applications, Machine Learning for Robot Control, Reinforcement Learning
Abstract: High-altitude balloons have proved useful for ecological aerial surveys, atmospheric monitoring, and communication relays. However, due to weight and power constraints, there is a need to investigate alternative modes of propulsion to navigate in the stratosphere. Very recently, reinforcement learning has been proposed as a control scheme to maintain balloons in the region of a fixed location, facilitated by diverse opposing wind fields at different altitudes. Although air-pump-based station keeping has been explored, there is no research on the control problem for venting-and-ballasting-actuated balloons, which are commonly used as a low-cost alternative. We show how reinforcement learning can be used for this type of balloon. Specifically, we use the soft actor-critic algorithm, which is able to station-keep within 50 km for, on average, 25% of the flight, consistent with the state-of-the-art. Furthermore, we show that the proposed controller effectively minimises the consumption of resources, thereby supporting long-duration flights. We frame the controller as a continuous-control reinforcement learning problem, which allows for a more diverse range of trajectories than current state-of-the-art work, which uses discrete action spaces. Furthermore, through continuous control, we can make use of larger ascent rates that are not possible using air pumps. The desired ascent rate is decoupled into a desired altitude and a time factor to provide a more transparent policy, compared to the low-level control commands used in previous works. Finally, by applying the equations of motion, we establish appropriate thresholds for venting and ballasting to prevent the agent from exploiting the environment; more specifically, we ensure actions are physically feasible by enforcing constraints on venting and ballasting.
|
| |
| 09:24-09:30, Paper MoAT11.10 | Add to My Program |
| A Light-Weight, Low-Cost, and Sustainable Planning System for UAVs Using a Local Map Origin Update Approach |
|
| Lee, Dasol | Agency for Defense Development |
| La, Jinche | Agency for Defense Development |
| Joo, Sanghyun | Agency for Defense Development |
Keywords: Aerial Systems: Applications, Motion and Path Planning
Abstract: This paper proposes a sustainable planning system for small-sized unmanned aerial vehicles (UAVs). The mapping module of the system uses a voxel array as its data structure and introduces a local map origin update feature. This approach has the clear advantage that the planning system can sustainably plan trajectories regardless of operating radius and flight distance, and it achieves the fastest, invariant time complexity, O(1), unlike other representation methods. We also propose an efficient configuration space (C-space) construction algorithm using incremental voxel inflation, and extend FIESTA, a state-of-the-art Euclidean signed distance field (ESDF) algorithm, by applying the local map origin update feature. The proposed planning system requires only a single depth camera as a sensor and can operate in real time on embedded computing platforms. We verified the planning system through real-world flight tests in dense environments using a light-weight quadrotor platform, under 300 mm in size, equipped with only low-cost components.
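The O(1) local map origin update described in the abstract can be illustrated with a ring-buffer (wrap-around) voxel array, one common realization of this idea; the class and method names below are hypothetical, and the lazy clearing of stale voxels is an assumption, not a detail taken from the paper:

```python
import numpy as np

class RollingVoxelMap:
    """Minimal sketch of a fixed-size voxel array whose origin follows the
    robot, assuming wrap-around (ring-buffer) indexing so that moving the
    origin never requires copying voxel data."""

    def __init__(self, size=64, resolution=0.1):
        self.size = size                   # voxels per axis
        self.resolution = resolution       # meters per voxel
        self.origin = np.zeros(3)          # world position of voxel (0, 0, 0)
        self.occupancy = np.zeros((size, size, size), dtype=np.uint8)

    def world_to_index(self, point):
        # Map a world point into the array with modular indexing.
        idx = np.floor((point - self.origin) / self.resolution).astype(int)
        return tuple(idx % self.size)

    def update_origin(self, new_origin):
        # Shifting the origin is O(1): only the offset changes; voxels that
        # fall out of range would be lazily cleared on next access.
        self.origin = new_origin

m = RollingVoxelMap()
idx = m.world_to_index(np.array([0.35, 0.0, 0.0]))  # voxel index for a world point
```

The key design point is that the array is addressed modulo its size, so a moving origin only changes an offset rather than forcing a block copy of the whole map.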
|
| |
| 09:30-09:36, Paper MoAT11.11 | Add to My Program |
| Bubble Explorer: Fast UAV Exploration in Large-Scale and Cluttered 3D-Environments Using Occlusion-Free Spheres |
|
| Tang, Benxu | The University of Hong Kong |
| Ren, Yunfan | The University of Hong Kong |
| Zhu, Fangcheng | The University of Hong Kong |
| He, Rui | The University of Hong Kong |
| Liang, Siqi | Harbin Institute of Technology, Shenzhen |
| Kong, Fanze | The University of Hong Kong |
| Zhang, Fu | University of Hong Kong |
Keywords: Aerial Systems: Applications, Aerial Systems: Perception and Autonomy, Motion and Path Planning
Abstract: Autonomous exploration is a crucial aspect of robotics with numerous applications. Most existing methods greedily choose goals that maximize immediate reward; this strategy is computationally efficient but insufficient for overall exploration efficiency. In recent years, state-of-the-art methods have been proposed that generate a global coverage path and significantly improve overall exploration efficiency. However, global optimization produces high computational overhead, leading to low-frequency planner updates and inconsistent planning motion. In this work, we propose a novel method to support fast UAV exploration in large-scale and cluttered 3-D environments. We introduce a computationally low-cost viewpoint generation method using occlusion-free spheres. Additionally, we combine a greedy strategy with global optimization, which considers both computational and exploration efficiency. We benchmark our method against state-of-the-art methods to showcase its superiority in terms of exploration efficiency and computational time, and we conduct various real-world experiments to demonstrate its excellent performance in large-scale and cluttered environments.
|
| |
| 09:36-09:42, Paper MoAT11.12 | Add to My Program |
| UPPLIED: UAV Path Planning for Inspection through Demonstration |
|
| Kannan, Shyam Sundar | Purdue University |
| Venkatesh, L.N Vishnunandan | Purdue University |
| Senthilkumaran, Revanth Krishna | Purdue University |
| Min, Byung-Cheol | Purdue University |
Keywords: Aerial Systems: Applications
Abstract: In this paper, a new demonstration-based path-planning framework for the visual inspection of large structures using UAVs is proposed. We introduce UPPLIED: UAV Path PLanning for InspEction through Demonstration, which utilizes a demonstrated trajectory to generate a new trajectory to inspect other structures of the same kind. The demonstrated trajectory can inspect specific regions of the structure, and the new trajectory generated by UPPLIED inspects similar regions in the other structure. The proposed method generates inspection points from the demonstrated trajectory and uses standardization to translate those inspection points to inspect the new structure. Finally, the positions of these inspection points are optimized to refine their view. Numerous experiments were conducted with various structures, and the proposed framework was able to generate inspection trajectories of various kinds for different structures based on the demonstration. The generated trajectories match the demonstrated trajectory in geometry while inspecting the regions covered by the demonstration trajectory with minimal deviation. The experimental video of the work can be found at https://youtu.be/YqPx-cLkv04.
|
| |
| 09:42-09:48, Paper MoAT11.13 | Add to My Program |
| Learning Fluid Flow Visualizations from In-Flight Images with Tufts |
|
| Lee, Jongseok | German Aerospace Center |
| Olsman, Jurrien | German Aerospace Center (DLR) |
| Triebel, Rudolph | German Aerospace Center (DLR) |
Keywords: Aerial Systems: Applications, Computer Vision for Automation, Object Detection, Segmentation and Categorization
Abstract: For a better understanding of fluid flows around aerial systems, strips of wire or rope, widely known as tufts, are often used to visualize the local flow direction. This paper presents a computer vision system that automatically extracts the shape of tufts from images collected during real flights of a helicopter and an unmanned aerial vehicle (UAV). As images from these aerial systems present challenges to both model-based computer vision and end-to-end supervised deep learning techniques, we propose a semantic segmentation pipeline that consists of three uncertainty-based modules, namely (a) active learning for object detection, (b) label propagation for object classification, and (c) weakly supervised instance segmentation. Overall, these probabilistic approaches facilitate the learning process without requiring any manual annotation of semantic segmentation masks. Empirically, we motivate our design choices through comparative assessments and provide real-world demonstrations of the proposed concept, for the first time to our knowledge. The project website can be found at https://sites.google.com/view/tuftrecognition.
|
| |
| 09:48-09:54, Paper MoAT11.14 | Add to My Program |
| Fully Autonomous Brick Pick-And-Place in Fields by Articulated Aerial Robot (I) |
|
| Anzai, Tomoki | The University of Tokyo |
| Zhao, Moju | The University of Tokyo |
| Nishio, Takuzumi | The University of Tokyo |
| Shi, Fan | ETH Zürich |
| Okada, Kei | The University of Tokyo |
| Inaba, Masayuki | The University of Tokyo |
Keywords: Aerial Systems: Applications, Field Robots, Grasping
Abstract: Picking and placing objects with an aerial robot in the field is an important and challenging task, which can significantly benefit not only industry but also rescue operations. The general strategy depends on magnetic force to pick up objects, which however lacks both generality and robustness. Therefore, we focus on an articulated structure to grasp bricks. Another issue in performing pick-and-place tasks in the field is autonomous recognition using onboard sensors. In this article, we present fully autonomous brick pick-and-place by an articulated aerial robot. First, an articulated robot model with an actively tiltable sensor is developed to guarantee robustness in both state estimation and object detection. Second, object detection methods are designed according to the distance between the robot and the target object. Third, a comprehensive motion strategy is developed to perform the autonomous object searching, picking, and placing sequence. In particular, a visual servoing method for robot position control is proposed within this motion strategy to improve robustness while approaching the target. Finally, we present the experimental results of autonomous
|
| |
| MoAT12 Regular session, 252AB |
Add to My Program |
| Perception for Grasping and Manipulation I |
|
| |
| Chair: Ang Jr, Marcelo H | National University of Singapore |
| Co-Chair: D'Avella, Salvatore | Scuola Superiore Sant'Anna |
| |
| 08:30-08:36, Paper MoAT12.1 | Add to My Program |
| I2c-Net: Using Instance-Level Neural Networks for Monocular Category-Level 6D Pose Estimation |
|
| Remus, Alberto | Sant'Anna School of Advanced Studies |
| D'Avella, Salvatore | Scuola Superiore Sant'Anna |
| Di Felice, Francesco | Mechanical Intelligence Institute, Sant'Anna School of Advanced |
| Tripicchio, Paolo | Scuola Superiore Sant'Anna |
| Avizzano, Carlo Alberto | Scuola Superiore Sant'Anna |
Keywords: Perception for Grasping and Manipulation, Deep Learning for Visual Perception, RGB-D Perception
Abstract: Object detection and pose estimation are strict requirements for many robotic grasping and manipulation applications, to endow robots with the ability to grasp objects with different properties in cluttered scenes and under various lighting conditions. This work proposes the framework i2c-net to extract the 6D pose of multiple objects belonging to different categories, starting from an instance-level pose estimation network and relying only on RGB images. The network is trained on a custom-made synthetic, photo-realistic dataset generated from base CAD models, suitably deformed and enriched with real textures for domain randomization purposes. At inference time, the instance-level network is employed in combination with a 3D mesh reconstruction module, achieving category-level capabilities. Depth information is used in postprocessing as a correction. Tests conducted on real objects from the YCB-V and NOCS REAL datasets outline the high accuracy of the proposed approach.
|
| |
| 08:36-08:42, Paper MoAT12.2 | Add to My Program |
| Self-Supervised Instance Segmentation by Grasping |
|
| Liu, YuXuan | Covariant.ai, UC Berkeley |
| Chen, Xi | Embodied Intelligence, UC Berkeley |
| Abbeel, Pieter | UC Berkeley |
Keywords: Object Detection, Segmentation and Categorization, Deep Learning for Visual Perception, Perception for Grasping and Manipulation
Abstract: Instance segmentation is a fundamental skill for many robotic applications. We propose a self-supervised method that uses grasp interactions to collect segmentation supervision for an instance segmentation model. When a robot grasps an item, the mask of that grasped item can be inferred from the images of the scene before and after the grasp. Leveraging this insight, we learn a grasp segmentation model from a small dataset of labelled images to segment the grasped object from before and after grasp images. Such a model can segment grasped objects from thousands of grasp interactions without costly human annotation. Using the segmented grasped objects, we can "cut" objects from their original scenes and "paste" them into new scenes to generate instance supervision. We show that our grasp segmentation model provides a 5x error reduction when segmenting grasped objects compared with traditional image subtraction approaches. Combined with our "cut-and-paste" generation method, instance segmentation models trained with our method achieve better performance than a model trained with 10x the amount of labelled data. On a real robotic grasping system, our instance segmentation model reduces the rate of grasp errors by over 3x compared to an image subtraction baseline.
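The image-subtraction baseline that the abstract compares the learned grasp segmentation model against can be sketched in a few lines; the function name and threshold value are illustrative, not taken from the paper:

```python
import numpy as np

def grasped_object_mask(before, after, threshold=30):
    """Sketch of the image-subtraction heuristic: pixels that change
    between the pre-grasp and post-grasp RGB images approximate the
    grasped item's mask. The paper's learned grasp segmentation model
    replaces this heuristic, which is brittle to lighting and shadows."""
    # Cast to a signed type so the subtraction cannot wrap around.
    diff = np.abs(before.astype(np.int16) - after.astype(np.int16))
    # A pixel counts as changed if any color channel moved enough.
    return diff.max(axis=-1) > threshold   # boolean H x W mask
```

This kind of per-pixel differencing is exactly what breaks under shadows and reflections, which is the motivation the abstract gives for learning the segmentation instead.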
|
| |
| 08:42-08:48, Paper MoAT12.3 | Add to My Program |
| Fusing Visual Appearance and Geometry for Multi-Modality 6DoF Object Tracking |
|
| Stoiber, Manuel | German Aerospace Center (DLR) |
| Elsayed, Mariam | Technical University Munich |
| Reichert, Anne Elisabeth | German Aerospace Center |
| Steidle, Florian | German Aerospace Center |
| Lee, Dongheui | Technische Universität Wien (TU Wien) |
| Triebel, Rudolph | German Aerospace Center (DLR) |
Keywords: Visual Tracking, Perception for Grasping and Manipulation, RGB-D Perception
Abstract: In many applications of advanced robotic manipulation, six degrees of freedom (6DoF) object pose estimates are continuously required. In this work, we develop a multi-modality tracker that fuses information from visual appearance and geometry to estimate object poses. The algorithm extends our previous method ICG, which uses geometry, to additionally consider surface appearance. In general, object surfaces contain local characteristics from text, graphics, and patterns, as well as global differences from distinct materials and colors. To incorporate this visual information, two modalities are developed. For local characteristics, keypoint features are used to minimize distances between points from keyframes and the current image. For global differences, a novel region approach is developed that considers multiple regions on the object surface. In addition, it allows the modeling of external geometries. Experiments on the YCB-Video and OPT datasets demonstrate that our approach ICG+ performs best on both datasets, outperforming both conventional and deep learning-based methods. At the same time, the algorithm is highly efficient and runs at more than 300 Hz. The source code of our tracker is publicly available.
|
| |
| 08:48-08:54, Paper MoAT12.4 | Add to My Program |
| Viewpoint Push Planning for Mapping of Unknown Confined Spaces |
|
| Dengler, Nils | University of Bonn |
| Pan, Sicong | University of Bonn |
| Kalagaturu, Vamsi Krishna | Hochschule Bonn-Rhein-Sieg |
| Menon, Rohit | University of Bonn |
| Elnagdi, Murad | University of Bonn |
| Bennewitz, Maren | University of Bonn |
Keywords: Perception for Grasping and Manipulation
Abstract: Viewpoint planning is an important task in any application where objects or scenes need to be viewed from different angles to achieve sufficient coverage. The mapping of confined spaces such as shelves is an especially challenging task since objects occlude each other and the scene can only be observed from the front, posing limitations on the possible viewpoints. In this paper, we propose a deep reinforcement learning framework that generates promising views aiming at reducing the map entropy. Additionally, the pipeline extends standard viewpoint planning by predicting adequate minimally invasive push actions to uncover occluded objects and increase the visible space. Using a 2.5D occupancy height map as state representation that can be efficiently updated, our system decides whether to plan a new viewpoint or perform a push. To learn feasible pushes, we use a neural network to sample push candidates on the map based on training data provided by human experts. As simulated and real-world experimental results with a robotic arm show, our system is able to significantly increase the mapped space compared to different baselines, while the executed push actions highly benefit the viewpoint planner with only minor changes to the object configuration.
|
| |
| 08:54-09:00, Paper MoAT12.5 | Add to My Program |
| Depth-Based 6DoF Object Pose Estimation Using Swin Transformer |
|
| Li, Zhujun | The City University of New York |
| Stamos, Ioannis | City University of New York |
Keywords: Perception for Grasping and Manipulation, Deep Learning Methods, Object Detection, Segmentation and Categorization
Abstract: Accurately estimating the 6D pose of objects is crucial for many applications, such as robotic grasping, autonomous driving, and augmented reality. However, this task becomes more challenging in poor lighting conditions or when dealing with textureless objects. To address this issue, depth images are becoming an increasingly popular choice due to their invariance to a scene's appearance and the implicit incorporation of essential geometric characteristics. However, fully leveraging depth information to improve the performance of pose estimation remains a difficult and under-investigated problem. To tackle this challenge, we propose a novel framework called SwinDePose that uses only geometric information from depth images to achieve accurate 6D pose estimation. SwinDePose first calculates the angles between each normal vector defined in a depth image and the three coordinate axes in the camera coordinate system. The resulting angles are then formed into an image, which is encoded using a Swin Transformer. Additionally, we apply RandLA-Net to learn representations from point clouds. The resulting image and point cloud embeddings are concatenated and fed into a semantic segmentation module and a 3D keypoint localization module. Finally, we estimate 6D poses using a least-squares fitting approach based on the target object's predicted semantic mask and 3D keypoints. In experiments on the LineMod and Occlusion LineMod datasets, SwinDePose outperforms existing state-of-the-art methods for 6D object pose estimation using depth images. We also provide competitive results on the YCB-Video dataset even without post-processing. This demonstrates the effectiveness of our approach and highlights its potential for improving performance in real-world scenarios. Our code is at https://github.com/zhujunli1993/SwinDePose.
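The first step described in the abstract, turning per-pixel surface normals into an angle image against the camera coordinate axes, can be sketched as follows (the function name and the choice of degrees are assumptions for illustration):

```python
import numpy as np

def normal_angle_image(normals):
    """Sketch of the angle-image construction described in the abstract:
    for each per-pixel unit normal (H x W x 3), compute its angle to the
    three camera coordinate axes, yielding a 3-channel angle image."""
    axes = np.eye(3)                           # x, y, z camera axes
    # Cosine of each angle is the dot product with the axis; clip guards
    # against small numerical excursions outside [-1, 1].
    cos = np.clip(normals @ axes.T, -1.0, 1.0)
    return np.degrees(np.arccos(cos))          # H x W x 3 angles in degrees
```

A normal pointing straight along the camera z-axis, for instance, maps to the angle triple (90, 90, 0), so the three channels jointly encode the surface orientation in an image-like layout a Swin Transformer can consume.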
|
| |
| 09:00-09:06, Paper MoAT12.6 | Add to My Program |
| DR-Pose: A Two-Stage Deformation-And-Registration Pipeline for Category-Level 6D Object Pose Estimation |
|
| Zhou, Lei | National University of Singapore |
| Liu, Zhiyang | National University of Singapore |
| Gan, Runze | National University of Singapore |
| Wang, Haozhe | National University of Singapore |
| Ang Jr, Marcelo H | National University of Singapore |
Keywords: Perception for Grasping and Manipulation, Deep Learning for Visual Perception
Abstract: Category-level object pose estimation involves estimating the 6D pose and the 3D metric size of objects from predetermined categories. While recent approaches take categorical shape prior information as a reference to improve pose estimation accuracy, the single-stage network design and training manner lead to sub-optimal performance, since there are two distinct tasks in the pipeline. In this paper, the advantage of a two-stage pipeline over a single-stage design is discussed. To this end, we propose a two-stage deformation-and-registration pipeline called DR-Pose, which consists of a completion-aided deformation stage and a scaled registration stage. The first stage uses a point cloud completion method to generate unseen parts of the target object, guiding subsequent deformation of the shape prior. In the second stage, a novel registration network is designed to extract pose-sensitive features and predict the representation of the object's partial point cloud in canonical space based on the deformation results from the first stage. DR-Pose produces superior results to the state-of-the-art shape-prior-based methods on both the CAMERA25 and REAL275 benchmarks. Code is available at https://github.com/Zray26/DR-Pose.git.
|
| |
| 09:06-09:12, Paper MoAT12.7 | Add to My Program |
| Learning from Pixels with Expert Observations |
|
| Hoang, Minh-Huy | University of Science, Ho Chi Minh City, Vietnam |
| Dinh, Long | Hanoi University of Science & Technology |
| Hai, Nguyen | Northeastern University |
Keywords: Reinforcement Learning, Learning from Demonstration, Deep Learning in Grasping and Manipulation
Abstract: In reinforcement learning (RL), sparse rewards can present a significant challenge. Fortunately, expert actions can be utilized to overcome this issue. However, acquiring explicit expert actions can be costly, and expert observations are often more readily available. This paper presents a new approach that uses expert observations for learning in robot manipulation tasks with sparse rewards from pixel observations. Specifically, our technique involves using expert observations as intermediate visual goals for a goal-conditioned RL agent, enabling it to complete a task by successively reaching a series of goals. We demonstrate the efficacy of our method in five challenging block construction tasks in simulation and show that when combined with two state-of-the-art agents, our approach can significantly improve their performance while requiring 4-20 times fewer expert actions during training. Moreover, our method is also superior to a hierarchical baseline.
|
| |
| 09:12-09:18, Paper MoAT12.8 | Add to My Program |
| RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control |
|
| Xiang, Yanfei | Tsinghua University |
| Wang, Xin | University at Buffalo |
| Hu, Shu | Carnegie Mellon University |
| Zhu, Bin Benjamin | Microsoft Research Asia |
| Huang, Xiaomeng | Tsinghua University |
| Wu, Xi | Chengdu University of Information Technology |
| Lyu, Siwei | University at Buffalo |
Keywords: Reinforcement Learning, Performance Evaluation and Benchmarking
Abstract: Reinforcement learning is used to tackle complex tasks with high-dimensional sensory inputs. Over the past decade, a wide range of reinforcement learning algorithms have been developed, with recent progress benefiting from deep learning for raw sensory signal representation. This raises a natural question: how well do these algorithms perform across different robotic manipulation tasks? Benchmarks use objective performance metrics to offer a scientific way to compare algorithms. In this paper, we introduce RMBench, the first benchmark for robotic manipulation with high-dimensional continuous action and state spaces. We implement and evaluate reinforcement learning algorithms that take observed pixels as inputs and report their average performance and learning curves to demonstrate their performance and training stability. Our study concludes that none of the evaluated algorithms can handle all tasks well; Soft Actor-Critic outperforms most algorithms in terms of average reward and stability, and combining an algorithm with data augmentation can potentially facilitate learning policies. Our code is publicly available at https://github.com/xiangyanfei212/RMBench-2022.git, including all benchmark tasks and studied algorithms.
|
| |
| 09:18-09:24, Paper MoAT12.9 | Add to My Program |
| Shape Completion with Prediction of Uncertain Regions |
|
| Humt, Matthias | German Aerospace Center (DLR), Technical University Munich (TUM) |
| Winkelbauer, Dominik | DLR |
| Hillenbrand, Ulrich | German Aerospace Center (DLR) |
Keywords: Perception for Grasping and Manipulation, RGB-D Perception
Abstract: Shape completion, i.e., predicting the complete geometry of an object from a partial observation, is highly relevant for several downstream tasks, most notably robotic manipulation. When basing planning or prediction of real grasps on object shape reconstruction, an indication of severe geometric uncertainty is indispensable. In particular, there can be an irreducible uncertainty in extended regions about the presence of entire object parts when given ambiguous object views. To treat this important case, we propose two novel methods for predicting such uncertain regions as straightforward extensions of any method for predicting local spatial occupancy, one through postprocessing occupancy scores, the other through direct prediction of an uncertainty indicator. We compare these methods together with two known approaches to probabilistic shape completion. Moreover, we generate a dataset, derived from ShapeNet [1], of realistically rendered depth images of object views with ground-truth annotations for the uncertain regions. We train on this dataset and test each method in shape completion and prediction of uncertain regions for known and novel object instances and on synthetic and real data. While direct uncertainty prediction is by far the most accurate in the segmentation of uncertain regions, both novel methods outperform the two baselines in shape completion and uncertain region prediction, and avoiding the predicted uncertain regions increases the quality of grasps for all tested methods. Web: https://github.com/DLR-RM/shape-completion
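The score-postprocessing variant described in the abstract can be illustrated as simple double thresholding of predicted occupancy probabilities; the function name and threshold values are illustrative assumptions, not the paper's:

```python
import numpy as np

def label_uncertain_regions(occupancy_prob, low=0.2, high=0.8):
    """Sketch of postprocessing occupancy scores into uncertain regions:
    voxels whose predicted occupancy probability falls between two
    thresholds are flagged as uncertain rather than being committed to
    occupied or free."""
    labels = np.full(occupancy_prob.shape, "uncertain", dtype=object)
    labels[occupancy_prob >= high] = "occupied"   # confidently present
    labels[occupancy_prob <= low] = "free"        # confidently absent
    return labels
```

Grasp planners can then simply avoid contact with voxels labeled "uncertain", which is the downstream use the abstract reports improves grasp quality.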
|
| |
| 09:24-09:30, Paper MoAT12.10 | Add to My Program |
| Structure from Action: Learning Interactions for 3D Articulated Object Structure Discovery |
|
| Nie, Neil | Columbia University |
| Gadre, Samir Yitzhak | Columbia University |
| Ehsani, Kiana | Allen Institute for Artificial Intelligence |
| Song, Shuran | Columbia University |
Keywords: Object Detection, Segmentation and Categorization, Perception-Action Coupling, Deep Learning for Visual Perception
Abstract: We introduce Structure from Action (SfA), a framework to discover 3D part geometry and joint parameters of unseen articulated objects via a sequence of inferred interactions. Our key insight is that 3D interaction and perception should be considered in conjunction to construct 3D articulated CAD models, especially for categories not seen during training. By selecting informative interactions, SfA discovers parts and reveals occluded surfaces, like the inside of a closed drawer. By aggregating visual observations in 3D, SfA accurately segments multiple parts, reconstructs part geometry, and infers all joint parameters in a canonical coordinate frame. Our experiments demonstrate that a SfA model trained in simulation can generalize to many unseen object categories with diverse structures and to real-world objects. Empirically, SfA outperforms a pipeline of state-of-the-art components by 25.4 3D IoU percentage points on unseen categories, while matching already performant joint estimation baselines.
|
| |
| 09:30-09:36, Paper MoAT12.11 | Add to My Program |
| Object-Oriented Option Framework for Robotics Manipulation in Clutter |
|
| Pang, Jing-Cheng | Nanjing University |
| Young, Stalin | Nanjing University |
| Xiong-Hui, Chen | National Key Laboratory for Novel Software Technology, Nanjing U |
| Yang, Xinyu | Nanjing University |
| Yang, Yu | National Key Laboratory for Novel Software Technology, Nanjing U |
| Mas, Ma | CloudMinds Robotics |
| Ziqi, Guo | CloudMinds Robotics |
| Yang, Howard | CloudMinds |
| Huang, Bill | CloudMinds Technologies Inc |
Keywords: Reinforcement Learning, Deep Learning in Grasping and Manipulation
Abstract: Domestic service robots are becoming increasingly popular due to their ability to help people with household tasks. These robots often encounter the challenge of manipulating objects in cluttered environments (MoC), which is difficult due to the complexity of effective planning and control. Previous solutions involved designing specific action primitives and planning paradigms. However, the pre-coded action primitives can limit the agility and task-solving scope of robots. In this paper, we propose a general approach for MoC called the Object-Oriented Option Framework (O3F), which uses the option framework (OF) to learn planning and control. The standard OF discovers options from scratch based on reinforcement learning, which can lead to collapsed options and hurt learning. To address this limitation, O3F introduces the concept of an object-oriented option space for OF, which focuses specifically on object movement and overcomes the challenges associated with collapsed options. Based on this, we train an object-oriented option planner to determine the option to execute and a universal object-oriented option executor to complete the option. Simulation experiments on the Ginger XR1 robot and robot arm show that O3F is generally applicable to various types of robots and manipulation tasks. Furthermore, O3F achieves success rates of 72.4% and 90% in grasping and object collecting tasks, respectively, significantly outperforming baseline methods.
|
| |
| 09:36-09:42, Paper MoAT12.12 | Add to My Program |
| Non-Contact Tactile Perception for Hybrid-Active Gripper |
|
| Pereira, Jonathas Henrique Mariano | IFSP - Institute Technology of Sao Paulo, Campus Registro |
| Joventino, Carlos Fernando | IFSP - Institute Technology of Sao Paulo, Campus Registro |
| Fabro, João Alberto | Federal University of Technology - Parana (UTFPR) |
| de Oliveira, Andre Schneider | Federal University of Technology - Parana |
Keywords: Object Detection, Segmentation and Categorization, Perception for Grasping and Manipulation, Manipulation Planning
Abstract: This paper presents a novel approach to object recognition using a reconfigurable gripper with multiple time-of-flight (ToF) sensors attached to the fingers and palm, introducing the concept of non-contact tactile perception. This approach aims to provide a proprioceptive sense of the gripper workspace, allowing object prediction in manipulation tasks. The Hybrid-Active (H-A) gripper can adapt its topology to achieve different object reading points to generate a reliable object estimation. Non-contact tactile perception uses ToF sensors and gripper reconfiguration degrees-of-freedom for 3D perception and surface estimation of the object to be picked up. This method is based on five ToF sensors in the palm that identify the distance and adjust the gripper to the center of the object through its capability to manage the manipulator. The H-A gripper also has twelve sensors distributed over its three fingers: four sensors on each finger, two on the distal phalanx, and two on the middle phalanx. Fingers have a rotational mobility of 180°, allowing the sensing of all faces of the object at different angles for three-dimensional reconstruction. The proposed approach was evaluated in four experiments that analyzed the influence of resolution, object complexity, finger tilt, and angular sampling over 13 objects with different complexities. The experimentation set allows the overall evaluation of non-contact tactile perception and the specification of its performance parameters.
|
| |
| MoAT13 Regular session, 260 Portside Ballroom |
Add to My Program |
| Visual Learning |
|
| |
| Chair: Wang, Yu-Xiong | University of Illinois Urbana-Champaign |
| Co-Chair: Watanabe, Tetsuyou | Kanazawa University |
| |
| 08:30-08:36, Paper MoAT13.1 | Add to My Program |
| ILabel: Revealing Objects in Neural Fields |
|
| Zhi, Shuaifeng | National University of Defense Technology |
| Sucar, Edgar | Imperial College London |
| Mouton, Andre | Dyson Ltd |
| Haughton, Iain | Dyson Ltd |
| Laidlow, Tristan | Boston Dynamics |
| Davison, Andrew J | Imperial College London |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Representation Learning
Abstract: A neural field trained with self-supervision to efficiently represent the geometry and colour of a 3D scene tends to automatically decompose it into coherent and accurate object-like regions, which can be revealed with sparse labelling interactions to produce a 3D semantic scene segmentation. Our real-time iLabel system takes input from a hand-held RGB-D camera, requires zero prior training data, and works in an 'open set' manner, with semantic classes defined on the fly by the user. iLabel's underlying model is a simple multilayer perceptron (MLP), trained from scratch to learn a neural representation of a single 3D scene. The model is updated continually and visualised in real-time, allowing the user to focus interactions to achieve extremely efficient semantic segmentation. A room-scale scene can be accurately labelled into 10+ semantic categories with around 100 clicks, taking less than 5 minutes. Quantitative labelling accuracy scales powerfully with the number of clicks, and rapidly surpasses standard pre-trained semantic segmentation methods. We also demonstrate a hierarchical labelling variant of iLabel and a 'hands-free' mode where the user only needs to supply label names for automatically-generated locations.
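The sparse click-based supervision described above can be illustrated with a loss evaluated only at labelled pixels; the function below is a hypothetical sketch of the idea, not the published iLabel implementation:

```python
import numpy as np

def sparse_click_loss(logits, clicks):
    """Cross-entropy evaluated only at user-clicked pixels.

    Sketch of sparse interactive supervision: `clicks` maps (row, col)
    pixel coordinates to class ids, and unlabelled pixels contribute
    nothing to the loss. `logits` has shape (H, W, num_classes).
    """
    loss = 0.0
    for (i, j), c in clicks.items():
        z = logits[i, j] - logits[i, j].max()     # stable log-softmax
        logp = z - np.log(np.exp(z).sum())
        loss -= logp[c]
    return loss / max(len(clicks), 1)
```

Because only a handful of pixels carry gradients, each user click is cheap to incorporate, which is consistent with the interactive, continually updated training loop the abstract describes.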
|
| |
| 08:36-08:42, Paper MoAT13.2 | Add to My Program |
| Weakly Supervised Referring Expression Grounding via Dynamic Self-Knowledge Distillation |
|
| Mi, Jinpeng | USST |
| Chen, Zhiqian | University of Shanghai for Science and Technology |
| Zhang, Jianwei | University of Hamburg |
Keywords: Visual Learning, Deep Learning for Visual Perception
Abstract: Weakly supervised referring expression grounding (WREG) is an attractive and challenging task for grounding target regions in images by understanding given referring expressions. WREG learns to ground target objects without the manual annotations between image regions and referring expressions during the model training phase. Different from the predominant grounding pattern of existing models, which locates target objects by reconstructing the region-expression correspondence, we investigate WREG from a novel perspective and enrich the prevailing pattern with self-knowledge distillation. Specifically, we propose a target-guided self-knowledge distillation approach that adopts the target prediction knowledge learned from the previous training iterations as the teacher to guide the subsequent training procedure. To avoid misleading guidance from teacher knowledge with low prediction confidence, we present an uncertainty-aware knowledge refinement strategy to adaptively rectify the teacher knowledge by learning dynamic threshold values based on the model prediction uncertainty. To validate the proposed approach, we implement extensive experiments on three benchmark datasets, i.e., RefCOCO, RefCOCO+, and RefCOCOg. Our approach achieves new state-of-the-art results on several splits of the benchmark datasets, showcasing the advantage of the proposed framework for WREG. The implementation codes and trained models are available at: https://github.com/dami23/WREG_Self_KD.
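The uncertainty-aware refinement idea, keeping only teacher predictions whose confidence clears a dynamic threshold, can be sketched as below; the batch-quantile rule is an assumption standing in for the paper's learned thresholds:

```python
import numpy as np

def refine_teacher(teacher_probs, quantile=0.5):
    """Filter teacher predictions with a dynamic confidence threshold.

    Illustrative sketch of uncertainty-aware knowledge refinement: the
    threshold adapts to the current batch rather than being fixed. The
    quantile rule is an assumption, not the paper's exact formulation.
    """
    probs = np.asarray(teacher_probs, dtype=float)
    conf = probs.max(axis=1)            # per-sample prediction confidence
    tau = np.quantile(conf, quantile)   # dynamic, batch-dependent threshold
    keep = conf >= tau
    return keep, probs[keep]
```

Only the surviving rows would then be distilled into the student in the next training iteration.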
|
| |
| 08:42-08:48, Paper MoAT13.3 | Add to My Program |
| EventTransAct: A Video Transformer-Based Framework for Event-Camera Based Action Recognition |
|
| de Blegiers, Tristan | University of Central Florida |
| Dave, Ishan Rajendrakumar | University of Central Florida |
| Yousaf, Adeel | University of Central Florida |
| Shah, Mubarak | University of Central Florida |
Keywords: Gesture, Posture and Facial Expressions, Visual Learning, Computer Vision for Automation
Abstract: Recognizing and comprehending human actions and gestures is a crucial perception requirement for robots to interact with humans and carry out tasks in diverse domains, including service robotics, healthcare, and manufacturing. Event cameras, with their ability to capture fast-moving objects at a high temporal resolution, offer new opportunities compared to standard action recognition in RGB videos. However, previous research on event camera action recognition has primarily focused on sensor-specific network architectures and image encoding, which may not be suitable for new sensors and limit the use of recent advancements in transformer-based architectures. In this study, we employ a computationally efficient model, namely the video transformer network (VTN), which first acquires spatial embeddings per event frame and then utilizes a temporal self-attention mechanism. This approach separates the spatial and temporal operations, making VTN more computationally efficient than other video transformers that process spatio-temporal volumes directly. To better adapt the VTN to the sparse and fine-grained nature of event data, we design an Event-Contrastive Loss (L_EC) and event-specific augmentations. The proposed L_EC promotes learning fine-grained spatial cues in the spatial backbone of VTN by contrasting temporally misaligned frames. We evaluate our method on real-world action recognition on the N-EPIC Kitchens dataset and achieve state-of-the-art results on both protocols: testing in seen kitchens (74.9% accuracy) and testing in unseen kitchens (42.43% and 46.66% accuracy). Our approach also takes less computation time than competitive prior approaches. We also evaluate our method on the standard DVS Gesture recognition dataset, achieving a competitive accuracy of 97.9% compared to prior work that uses dedicated architectures and image encoding for the DVS dataset. These results demonstrate the potential of our framework EventTransAct for real-world applications of event-camera based action recognition. Project page: https://tristandb8.github.io/EventTransAct_webpage/
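A contrastive loss of the kind the abstract describes, rewarding agreement with a temporally aligned frame over temporally misaligned ones, can be sketched in an InfoNCE style; the exact formulation and temperature in the paper may differ:

```python
import numpy as np

def event_contrastive_loss(anchor, positive, negatives, tau=0.1):
    """Contrast an event-frame embedding against an aligned positive and
    temporally misaligned negatives.

    A hedged sketch of the idea behind the Event-Contrastive Loss; the
    published L_EC may use a different similarity or normalization.
    """
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    logits = np.array([cos(anchor, positive)]
                      + [cos(anchor, n) for n in negatives]) / tau
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                     # the positive should win
```

Minimizing this loss pushes the spatial backbone to produce embeddings that distinguish frames a few timesteps apart, i.e., the fine-grained spatial cues mentioned above.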
|
| |
| 08:48-08:54, Paper MoAT13.4 | Add to My Program |
| Virtual Ski Training System That Allows Beginners to Acquire Ski Skills Based on Physical and Visual Feedbacks |
|
| Okada, Yushi | Waseda University |
| Seo, Chanjin | Waseda University |
| Miyakawa, Shunichi | Waseda University |
| Taniguchi, Motofumi | Waseda University |
| Kanosue, Kazuyuki | Waseda University |
| Ogata, Hiroyuki | Seikei University |
| Ohya, Jun | Waseda University |
Keywords: Virtual Reality and Interfaces, Visual Learning, Sensorimotor Learning
Abstract: This paper proposes a ski training system using VR (Virtual Reality) that enables beginners to acquire skiing skills without going to an actual ski ground. The proposed system obtains the speed of skiing based on the center of pressure (COP) of each player's foot. The first-person perspective of skiing at the obtained speed down a ski slope is fed back to the player as a VR image. Experiments were conducted to evaluate the effectiveness of the proposed system and the VR interface. Specifically, beginner skiers were categorized into three groups: "a group trained with the proposed VR system", "a group trained with a system that provides feedback of the skiing speed calculated from the COP by increasing or decreasing the gauge (a bar-shaped graph representing changes in numerical values), instead of VR", and "a group that does not train with the system". After training under each of these conditions, a sliding test was conducted on an actual ski slope to check the degree of skill acquisition. The results show that subjects trained with the proposed system acquired more skiing skill on actual ski slopes than subjects who did not use the system. Furthermore, there was no clear difference in the sliding-test results between subjects trained with the VR interface and those trained with the gauge interface, but the VR interface yielded better deceleration postures.
|
| |
| 08:54-09:00, Paper MoAT13.5 | Add to My Program |
| Attention-Based VR Facial Animation with Visual Mouth Camera Guidance for Immersive Telepresence Avatars |
|
| Rochow, Andre | University of Bonn |
| Schwarz, Max | University Bonn |
| Behnke, Sven | University of Bonn |
Keywords: Gesture, Posture and Facial Expressions, Visual Learning, Human-Robot Collaboration
Abstract: Facial animation in virtual reality environments is essential for applications that necessitate clear visibility of the user's face and the ability to convey emotional signals. In our scenario, we animate the face of an operator who controls a robotic Avatar system. The use of facial animation is particularly valuable when the perception of interacting with a specific individual, rather than just a robot, is intended. Purely keypoint-driven animation approaches struggle with the complexity of facial movements. We present a hybrid method that uses both keypoints and direct visual guidance from a mouth camera. Our method generalizes to unseen operators and requires only a quick enrolment step with capture of two short videos. Multiple source images are selected with the intention to cover different facial expressions. Given a mouth camera frame from the HMD, we dynamically construct the target keypoints and apply an attention mechanism to determine the importance of each source image. To resolve keypoint ambiguities and animate a broader range of mouth expressions, we propose to inject visual mouth camera information into the latent space. We enable training on large-scale speaking head datasets by simulating the mouth camera input with its perspective differences and facial deformations. Our method outperforms a baseline in quality, capability, and temporal consistency. In addition, we highlight how the facial animation contributed to our victory at the ANA Avatar XPRIZE Finals.
|
| |
| 09:00-09:06, Paper MoAT13.6 | Add to My Program |
| Test-Time Adaptation for Point Cloud Upsampling Using Meta-Learning |
|
| Hatem, Ahmed | University of Manitoba |
| Qian, Yiming | University of Manitoba |
| Wang, Yang | Concordia University |
Keywords: Visual Learning, Deep Learning Methods, Transfer Learning
Abstract: Affordable 3D scanners often produce sparse and non-uniform point clouds that negatively impact downstream applications in robotic systems. While existing point cloud upsampling architectures have demonstrated promising results on standard benchmarks, they tend to experience significant performance drops when the test data have different distributions from the training data. To address this issue, this paper proposes a test-time adaptation approach to enhance model generality of point cloud upsampling. The proposed approach leverages meta-learning to explicitly learn network parameters for test-time adaptation. Our method does not require any prior information about the test data. During meta-training, the model parameters are learned from a collection of instance-level tasks, each of which consists of a sparse-dense pair of point clouds from the training data. During meta-testing, the trained model is fine-tuned with a few gradient updates to produce a unique set of network parameters for each test instance. The updated model is then used for the final prediction. Our framework is generic and can be applied in a plug-and-play manner with existing backbone networks in point cloud upsampling. Extensive experiments demonstrate that our approach improves the performance of state-of-the-art models.
|
| |
| 09:06-09:12, Paper MoAT13.7 | Add to My Program |
| Revisiting Event-Based Video Frame Interpolation |
|
| Chen, Jiaben | University of California, San Diego |
| Zhu, Yichen | Shanghaitech University |
| Lian, Dongze | National University of Singapore |
| Yang, Jiaqi | ShanghaiTech University |
| Wang, Yifu | ShanghaiTech University |
| Zhang, Renrui | Peking University |
| Liu, Xinhang | HKUST |
| Qian, Shenhan | Technical University of Munich |
| Kneip, Laurent | ShanghaiTech University |
| Gao, Shenghua | Shanghaitech University |
Keywords: Visual Learning, Sensor Fusion, Deep Learning for Visual Perception
Abstract: Dynamic vision sensors or event cameras provide rich complementary information for video frame interpolation. Existing state-of-the-art methods follow the paradigm of combining both synthesis-based and warping networks. However, few of those methods fully respect the intrinsic characteristics of event streams. Given that event cameras only encode intensity changes and polarity rather than color intensities, estimating optical flow from events is arguably more difficult than from RGB information. We therefore propose to incorporate RGB information in an event-guided optical flow refinement strategy. Moreover, in light of the quasi-continuous nature of the time signals provided by event cameras, we propose a divide-and-conquer strategy in which event-based intermediate frame synthesis happens incrementally in multiple simplified stages rather than in a single, long stage. Extensive experiments on both synthetic and real-world datasets show that these modifications lead to more reliable and realistic intermediate frame results than previous video frame interpolation methods. Our findings underline that a careful consideration of event characteristics such as high temporal density and elevated noise benefits interpolation accuracy.
|
| |
| 09:12-09:18, Paper MoAT13.8 | Add to My Program |
| Revisiting Deformable Convolution for Depth Completion |
|
| Sun, Xinglong | Stanford & UIUC |
| Ponce, Jean | Ecole Normale Supérieure |
| Wang, Yu-Xiong | University of Illinois Urbana-Champaign |
Keywords: RGB-D Perception, Visual Learning
Abstract: Depth completion, which aims to generate high-quality dense depth maps from sparse depth maps, has attracted increasing attention in recent years. Previous work usually employs RGB images as guidance, and introduces iterative spatial propagation to refine estimated coarse depth maps. However, most of the propagation refinement methods require several iterations and suffer from a fixed receptive field, which may contain irrelevant and useless information with very sparse input. In this paper, we address these two challenges simultaneously by revisiting the idea of deformable convolution. We propose an effective architecture that leverages deformable kernel convolution as a single-pass refinement module, and empirically demonstrate its superiority. To better understand the function of deformable convolution and exploit it for depth completion, we further systematically investigate a variety of representative strategies. Our study reveals that, different from prior work, deformable convolution needs to be applied on an estimated depth map with a relatively high density for better performance. We evaluate our model on the large-scale KITTI dataset and achieve state-of-the-art level performance in both accuracy and inference speed. Our code is available at https://github.com/AlexSunNik/ReDC.
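The core operation that lets deformable convolution escape a fixed receptive field is bilinear sampling at learned fractional offsets; a minimal single-tap sketch (not the paper's full refinement module) is:

```python
import numpy as np

def deform_sample(feat, base_yx, offset_yx):
    """Bilinearly sample a feature map at a fractionally offset location.

    Each kernel tap of a deformable convolution reads from a learned
    offset rather than the fixed grid; this sketch shows one such tap
    with zero padding outside the map.
    """
    y = base_yx[0] + offset_yx[0]
    x = base_yx[1] + offset_yx[1]
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    dy, dx = y - y0, x - x0
    h, w = feat.shape
    def px(i, j):  # zero-padded lookup
        return float(feat[i, j]) if 0 <= i < h and 0 <= j < w else 0.0
    return ((1 - dy) * (1 - dx) * px(y0, x0) + (1 - dy) * dx * px(y0, x0 + 1)
            + dy * (1 - dx) * px(y0 + 1, x0) + dy * dx * px(y0 + 1, x0 + 1))
```

With sparse depth input, letting the offsets skip over empty pixels toward valid measurements is precisely what a fixed kernel grid cannot do.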
|
| |
| 09:18-09:24, Paper MoAT13.9 | Add to My Program |
| Long-Distance Gesture Recognition Using Dynamic Neural Networks |
|
| Bhatnagar, Shubhang | University of Illinois at Urbana-Champaign |
| Gopal, Sharath | Bosch |
| Ahuja, Narendra | Univ. of Illinois |
| Ren, Liu | Robert Bosch North America Research Technology Center |
Keywords: Gesture, Posture and Facial Expressions, Visual Learning, Recognition
Abstract: Gestures form an important medium of communication between humans and machines. An overwhelming majority of existing gesture recognition methods are tailored to a scenario where humans and machines are located very close to each other. This short-distance assumption does not hold true for several types of interactions, for example gesture-based interactions with a floor cleaning robot or with a drone. Methods made for short-distance recognition are unable to perform well on long-distance recognition due to gestures occupying only a small portion of the input data. Their performance is especially worse in resource constrained settings where they are not able to effectively focus their limited compute on the gesturing subject. We propose a novel, accurate and efficient method for the recognition of gestures from longer distances. It uses a dynamic neural network to select features from gesture-containing spatial regions of the input sensor data for further processing. This helps the network focus on features important for gesture recognition while discarding background features early on, thus making it more compute efficient compared to other techniques. We demonstrate the performance of our method on the LD-ConGR long-distance dataset where it outperforms previous state-of-the-art methods on recognition accuracy and compute efficiency.
|
| |
| 09:24-09:30, Paper MoAT13.10 | Add to My Program |
| Neural Implicit Vision-Language Feature Fields |
|
| Blomqvist, Kenneth | ETH Zurich |
| Milano, Francesco | ETH Zurich |
| Chung, Jen Jen | The University of Queensland |
| Ott, Lionel | ETH Zurich |
| Siegwart, Roland | ETH Zurich |
Keywords: Semantic Scene Understanding, Visual Learning, Representation Learning
Abstract: Recently, groundbreaking results have been presented on open-vocabulary semantic image segmentation. Such methods segment each pixel in an image into arbitrary categories provided at run-time in the form of text prompts, as opposed to a fixed set of classes defined at training time. In this work, we present a method for volumetric open-vocabulary semantic scene segmentation. Our method builds on the insight that we can fuse 2D image features from a vision-language model into a neural implicit representation. We show that the resulting feature field can be segmented into different classes by assigning points to the closest natural language text prompt. Using an implicit volumetric representation enables us to segment the scene both in 3D and 2D by rendering feature maps from any given viewpoint of the scene. We show that our method works on noisy real-world data and can run in real-time on live sensor data dynamically adjusting to text prompts. We also present quantitative comparisons on the diverse ScanNet dataset.
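The assignment step described above, labelling each point by its closest text prompt, reduces to a cosine-similarity argmax; the arrays below stand in for CLIP-style vision-language features, which are assumed inputs:

```python
import numpy as np

def segment_by_prompt(point_feats, prompt_feats):
    """Assign each rendered feature to the closest text-prompt embedding.

    A sketch of open-vocabulary assignment: normalize both sets of
    embeddings and take the argmax of cosine similarity. Placeholder for
    features from an actual vision-language model.
    """
    p = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    t = prompt_feats / np.linalg.norm(prompt_feats, axis=1, keepdims=True)
    return np.argmax(p @ t.T, axis=1)   # index of best-matching prompt
```

Because the prompts are supplied at query time, re-running only this step with new text re-segments the scene without touching the learned feature field.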
|
| |
| 09:30-09:36, Paper MoAT13.11 | Add to My Program |
| Language Guided Robotic Grasping with Fine-Grained Instructions |
|
| Sun, Qiang | Fudan University |
| Lin, Haitao | Fudan University |
| Fu, Ying | Beijing Institute of Technology |
| Fu, Yanwei | Fudan University |
| Xue, Xiangyang | Fudan University |
Keywords: Visual Learning, Semantic Scene Understanding, Grasping
Abstract: Given a single RGB image and attribute-rich language instructions, this paper investigates the novel problem of using Fine-grained instructions for Language guided robotic Grasping (FLarG). This problem is made challenging by the need to learn fine-grained language descriptions to ground target objects. Recent advances have been made in visually grounding objects using only a few coarse attributes. However, these methods have poor performance as they cannot well align the multi-modal features, and do not make the best use of recent powerful large pre-trained vision and language models, e.g., CLIP. To this end, this paper proposes a FLarG pipeline including stages of CLIP-guided object localization and 6-DoF category-level object pose estimation for grasping. Specifically, we first take the CLIP-based segmentation model CRIS as the backbone and propose an end-to-end DyCRIS model that uses a novel dynamic mask strategy to fuse the multi-level language and vision features. Then, the well-trained instance segmentation backbone Mask R-CNN is adopted to further improve the predicted mask of our DyCRIS. Finally, the target object pose is inferred for robotic grasping by using a recent 6-DoF object pose estimation method. To validate our CLIP-enhanced pipeline, we also construct a validation dataset for the FLarG task, named RefNOCS. Extensive results on RefNOCS show the utility and effectiveness of our proposed method. The project homepage is available at https://sunqiang85.github.io/FLarG.
|
| |
| 09:36-09:42, Paper MoAT13.12 | Add to My Program |
| Whole Shape Estimation of Transparent Object from Its Contour Using Statistical Shape Model |
|
| Okada, Kaihei | Kanazawa University |
| Kobayashi, Riku | Kanazawa University |
| Tsuji, Tokuo | Kanazawa University |
| Hiramitsu, Tatsuhiro | Kanazawa University |
| Seki, Hiroaki | Kanazawa University |
| Nishimura, Toshihiro | Kanazawa University |
| Suzuki, Yosuke | Kanazawa University |
| Watanabe, Tetsuyou | Kanazawa University |
Keywords: Computer Vision for Automation, Computer Vision for Manufacturing
Abstract: This paper presents a method for estimating the 3D shape of transparent objects from an RGB-D image using a statistical shape model. Statistical shape models compress dimensions from multiple shapes to represent variations in shape with fewer parameters. It is difficult to measure the depth of a transparent object with any sensor. Therefore, the statistical shape model is deformed to fit the contour extracted from the RGB image to estimate the shape of the object. The depth image is only used for detecting the plane on which transparent objects are placed. The proposed method estimates the whole shape of transparent objects, unlike other estimation methods. The estimation accuracy of the proposed method is compared with that of a machine-learning-based method. In addition, the estimated whole shape is compared with data measured by a 3D scanner.
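A statistical shape model of the kind described above is typically a PCA over aligned training shapes; the sketch below is illustrative of that construction, not the paper's exact model or contour-fitting procedure:

```python
import numpy as np

def build_shape_model(shapes, k=1):
    """Build a statistical (PCA) shape model from aligned training shapes.

    Each row of `shapes` is a flattened contour; the top-k principal
    components capture shape variation, so any modelled shape is
    mean + basis @ weights. Illustrative sketch only.
    """
    mean = shapes.mean(axis=0)
    _, _, vt = np.linalg.svd(shapes - mean, full_matrices=False)
    basis = vt[:k].T                    # (dims, k) deformation modes
    def reconstruct(weights):
        return mean + basis @ weights
    return mean, basis, reconstruct
```

Fitting then amounts to searching the low-dimensional weight space (plus pose) so the model's projected silhouette matches the contour extracted from the RGB image.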
|
| |
| MoAT14 Regular session, 320 |
Add to My Program |
| Localization I |
|
| |
| Chair: Malis, Ezio | Inria |
| Co-Chair: Lopez, Brett | University of California, Los Angeles |
| |
| 08:30-08:36, Paper MoAT14.1 | Add to My Program |
| A Hierarchical Multi-Task Visual Relocalization System |
|
| Yin, Jiahao | Beihang University |
| Xiao, Huahui | BUAA |
| Li, Wei | Beihang University |
| Zhou, Xinyu | University of International Business and Economics |
| Liu, Zhili | Yihang Intellitech Co., Ltd |
| Li, Xue | Yihang Intellitech Co., Ltd |
| Fan, Shengyin | Yihang Intellitech Co., Ltd |
Keywords: Localization, SLAM, Autonomous Vehicle Navigation
Abstract: Locating the 6DoF pose of a camera in a known scene graph is a fundamental problem in SLAM. Hierarchical relocalization methods, which retrieve images first and match feature points later, have been widely studied for their high accuracy. In this paper, based on hierarchical relocalization, HAPOR (Hierarchical-features Aligned Projection Optimization for Relocalization), an end-to-end relocalization system, is proposed to combine image retrieval and iterative pose optimization. Through an attention mechanism branch, foreground dynamic objects and repeating textures are filtered out. We further design an image retrieval system (GTLGR) in HAPOR and generate an initial pose based on the co-visibility graph for subsequent iterative optimization. In addition, relying on GPS as ground truth for image retrieval training is quite inefficient; thus, we model the common visible area of two cameras' views in 3D space, which significantly reduces the training time. Finally, we apply HAPOR to the ORB-SLAM2 system and obtain state-of-the-art relocalization results. A demo is available at https://www.youtube.com/watch?v=rCLpWCxN31M
|
| |
| 08:36-08:42, Paper MoAT14.2 | Add to My Program |
| RI-LIO: Reflectivity Image Assisted Tightly-Coupled LiDAR-Inertial Odometry |
|
| Zhang, Yanfeng | Institute of Automation, Chinese Academy of Sciences |
| Tian, Yunong | Institute of Automation, Chinese Academy of Sciences |
| Wang, Wanguo | State Grid Intelligence Technology Co., Ltd |
| Yang, Guodong | Institute of Automation, Chinese Academy of Sciences |
| Li, Zhishuo | Chinese Academy of Sciences |
| Jing, Fengshui | Institute of Automation, CAS |
| Tan, Min | Institute of Automation, Chinese Academy of Sciences |
Keywords: Localization, SLAM, Mapping
Abstract: In this letter, we propose RI-LIO, a new reflectivity image assisted tightly-coupled LiDAR-inertial odometry (LIO) framework that introduces additional reflectivity texture information to efficiently reduce the drift of geometric-only methods. To achieve this, we construct an iterated extended Kalman filter framework by blending the point-to-plane geometric measurement and the reflectivity image measurement. Specifically, the geometric measurement is defined as the distance from the raw point of a new scan to its nearest neighbor plane in the global incremental kd-tree map. The searched nearest neighbor point is used to render a sparse reflectivity image after LiDAR motion distortion information is given by its corresponding raw point. Then, the reflectivity measurement is built to align the sparse reflectivity image with the dense reflectivity image of the current scan by minimizing the photometric errors directly. In addition, based on the mechanism of high-resolution LiDAR, a corrected spherical projection model is proposed to project spatial points into the image frame. Finally, extensive experiments are conducted using different mobile robots in structured, unstructured and challenging open field scenarios. The results demonstrate that the proposed method outperforms existing geometric-only methods in terms of robustness and accuracy, especially in the rotation direction.
|
| |
| 08:42-08:48, Paper MoAT14.3 | Add to My Program |
| Off the Radar: Uncertainty-Aware Radar Place Recognition with Introspective Querying and Map Maintenance |
|
| Yuan, Jianhao | University of Oxford |
| Newman, Paul | Oxford University |
| Gadd, Matthew | University of Oxford |
Keywords: Localization, Mapping, Deep Learning Methods
Abstract: Localisation with Frequency-Modulated Continuous-Wave (FMCW) radar has gained increasing interest due to its inherent resistance to challenging environments. However, complex artefacts of the radar measurement process require appropriate uncertainty estimation to ensure the safe and reliable application of this promising sensor modality. In this work, we propose a multi-session map management system which constructs the "best" maps for further localisation based on learned variance properties in an embedding space. Using the same variance properties, we also propose a new way to introspectively reject localisation queries that are likely to be incorrect. For this, we apply robust noise-aware metric learning, which both leverages the short-timescale variability of radar data along a driven path (for data augmentation) and predicts the downstream uncertainty in metric-space-based place recognition. We prove the effectiveness of our method over extensive cross-validated tests of the Oxford Radar RobotCar and MulRan datasets. In this, we outperform the current state-of-the-art in radar place recognition and other uncertainty-aware methods when using only single nearest-neighbour queries. We also show consistent performance increases when rejecting queries based on uncertainty over a difficult test environment, which we did not observe for a competing uncertainty-aware place recognition system.
|
| |
| 08:48-08:54, Paper MoAT14.4 | Add to My Program |
| Global Localization in Unstructured Environments Using Semantic Object Maps Built from Various Viewpoints |
|
| Ankenbauer, Jacqueline | Massachusetts Institute of Technology |
| Lusk, Parker C. | Massachusetts Institute of Technology |
| Thomas, Annika | Massachusetts Institute of Technology |
| How, Jonathan | Massachusetts Institute of Technology |
Keywords: Localization, Mapping, SLAM
Abstract: We present a novel framework for global localization and guided relocalization of a vehicle in an unstructured environment. Compared to existing methods, our pipeline does not rely on cues from urban fixtures (e.g., lane markings, buildings), nor does it make assumptions that require the vehicle to be navigating on a road network. Instead, we achieve localization in both urban and non-urban environments by robustly associating and registering the vehicle's local semantic object map with a compact semantic reference map, potentially built from other viewpoints, time periods, and/or modalities. Robustness to noise, outliers, and missing objects is achieved through our graph-based data association algorithm. Further, the guided relocalization capability of our pipeline mitigates drift inherent in odometry-based localization after the initial global localization. We evaluate our pipeline on two publicly available, real-world datasets to demonstrate its effectiveness at global localization in both non-urban and urban environments. The Katwijk Beach Planetary Rover dataset [1] is used to show our pipeline's ability to perform accurate global localization in unstructured environments. Demonstrations on the KITTI dataset [2] achieve an average pose error of 3.8m across all 35 localization events on Sequence 00 when localizing in a reference map created from aerial images. Compared to existing works, our pipeline is more general because it can perform global localization in unstructured environments using maps built from different viewpoints.
|
| |
| 08:54-09:00, Paper MoAT14.5 | Add to My Program |
| Constructing Metric-Semantic Maps Using Floor Plan Priors for Long-Term Indoor Localization |
|
| Zimmerman, Nicky | University of Bonn |
| Sodano, Matteo | Photogrammetry and Robotics Lab, University of Bonn |
| Marks, Elias Ariel | University of Bonn |
| Behley, Jens | University of Bonn |
| Stachniss, Cyrill | University of Bonn |
Keywords: Localization, Mapping
Abstract: Object-based maps are relevant for scene understanding since they integrate geometric and semantic information of the environment, allowing autonomous robots to robustly localize and interact with objects. In this paper, we address the task of constructing a metric-semantic map for the purpose of long-term object-based localization. We exploit 3D object detections from monocular RGB frames both for the object-based map construction and for globally localizing in the constructed map. To tailor the approach to a target environment, we propose an efficient way of generating 3D annotations to finetune the 3D object detection model. We evaluate our map construction in an office building, and test our long-term localization approach on challenging sequences recorded in the same environment over nine months. The experiments suggest that our approach is suitable for constructing metric-semantic maps, and that our localization approach is robust to long-term changes. Both the mapping algorithm and the localization pipeline can run online on an onboard computer. We release an open-source C++/ROS implementation of our approach.
|
| |
| 09:00-09:06, Paper MoAT14.6 | Add to My Program |
| DisPlacing Objects: Improving Dynamic Vehicle Detection Via Visual Place Recognition under Adverse Conditions |
|
| Hausler, Stephen | CSIRO |
| Garg, Sourav | Queensland University of Technology |
| Chakravarty, Punarjay | Planet |
| Shrivastava, Shubham | Ford Greenfield Labs |
| Vora, Ankit | Ford Motor Company |
| Milford, Michael J | Queensland University of Technology |
Keywords: Autonomous Vehicle Navigation, Object Detection, Segmentation and Categorization, Localization
Abstract: Can knowing where you are assist in perceiving objects in your surroundings, especially under adverse weather and lighting conditions? In this work we investigate whether a prior map can be leveraged to aid in the detection of dynamic objects in a scene without the need for a 3D map or pixel-level map-query correspondences. We contribute an algorithm which refines an initial set of candidate object detections and produces a refined subset of highly accurate detections using a prior map. We begin by using visual place recognition (VPR) to retrieve a prior map image for a given query image, then use a binary classification neural network that compares the query and prior map image regions to validate the query detection. Once our classification network is trained on approximately 1000 query-map image pairs, it is able to improve the performance of vehicle detection when combined with an existing off-the-shelf vehicle detector. We demonstrate our approach using standard datasets across two cities (Oxford and Zurich) under different settings of train-test separation of map-query traverse pairs. We further emphasize the performance gains of our approach against alternative design choices and show that VPR suffices for the task, eliminating the need for precise ground truth localization.
|
| |
| 09:06-09:12, Paper MoAT14.7 | Add to My Program |
| FM-Loc: Using Foundation Models for Improved Vision-Based Localization |
|
| Mirjalili, Reihaneh | University of Technology Nuremberg |
| Krawez, Michael | University of Technology Nuremberg |
| Burgard, Wolfram | University of Technology Nuremberg |
Keywords: Localization, SLAM, Vision-Based Navigation
Abstract: Visual place recognition is essential for vision-based robot localization and SLAM. Despite the tremendous progress made in recent years, place recognition in changing environments remains challenging. A promising approach to cope with appearance variations is to leverage high-level semantic features like objects or place categories. In this paper, we propose FM-Loc, a novel image-based localization approach based on Foundation Models that uses the Large Language Model GPT-3 in combination with the Visual-Language Model CLIP to construct a semantic image descriptor that is robust to severe changes in scene geometry and camera viewpoint. We deploy CLIP to detect objects in an image, GPT-3 to suggest potential room labels based on the detected objects, and CLIP again to propose the most likely location label. The object labels and the scene label constitute an image descriptor that we use to calculate a similarity score between the query and database images. We validate our approach on real-world data that exhibit significant changes in camera viewpoints and object placement between the database and query trajectories. The experimental results demonstrate that our method is applicable to a wide range of indoor scenarios without the need for training or fine-tuning.
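A descriptor built from object labels plus a scene label can be compared with a simple set-based score; the weighting and Jaccard choice below are illustrative assumptions, not the paper's similarity function:

```python
def descriptor_similarity(query, reference, scene_weight=0.5):
    """Hypothetical similarity between two image descriptors, each a
    (object_label_set, scene_label) pair, loosely following the idea of
    combining object and place-category cues. Weights are illustrative.
    """
    q_objs, q_scene = query
    r_objs, r_scene = reference
    union = q_objs | r_objs
    obj_sim = len(q_objs & r_objs) / len(union) if union else 0.0  # Jaccard
    scene_sim = 1.0 if q_scene == r_scene else 0.0
    return (1 - scene_weight) * obj_sim + scene_weight * scene_sim
```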
|
| |
| 09:12-09:18, Paper MoAT14.8 | Add to My Program |
| Joint On-Manifold Gravity and Accelerometer Intrinsics Estimation for Inertially Aligned Mapping |
|
| Nemiroff, Ryan | University of California, Los Angeles |
| Chen, Kenny | University of California, Los Angeles |
| Lopez, Brett | University of California, Los Angeles |
Keywords: Localization, Mapping, SLAM
Abstract: Aligning a robot's trajectory or map to the inertial frame is a critical capability that is often difficult to do accurately even though inertial measurement units (IMUs) can observe absolute roll and pitch with respect to gravity. Accelerometer biases and scale factor errors from the IMU's initial calibration are often the major source of inaccuracies when aligning the robot's odometry frame with the inertial frame, especially for low-grade IMUs. Practically, one would simultaneously estimate the true gravity vector, accelerometer biases, and scale factor to improve measurement quality but these quantities are not observable unless the IMU is sufficiently excited. While several methods estimate accelerometer bias and gravity, they do not explicitly address the observability issue nor do they estimate scale factor. We present a fixed-lag factor-graph-based estimator to address both of these issues. In addition to estimating accelerometer scale factor, our method mitigates limited observability by optimizing over a time window an order of magnitude larger than existing methods with significantly lower computational burden. The proposed method, which estimates accelerometer intrinsics and gravity separately from the other states, is enabled by a novel, velocity-agnostic measurement model for intrinsics and gravity, as well as a new method for gravity vector optimization on S2. Accurate IMU state prediction, gravity-alignment, and roll/pitch drift correction are experimentally demonstrated on public and self-collected datasets in diverse environments.
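Optimizing a gravity direction on S2 means updating a unit vector with a 2-DoF perturbation while preserving unit norm. A generic sphere retraction, offered only as a sketch of the idea (not the paper's exact operator):

```python
import numpy as np

def retract_s2(g, delta):
    """Update a unit gravity vector g by a 2-DoF perturbation delta
    expressed in the tangent plane at g, then re-project onto S2.
    """
    g = g / np.linalg.norm(g)
    # Build an orthonormal basis (b1, b2) of the tangent plane at g.
    tmp = np.array([1.0, 0.0, 0.0])
    if abs(g[0]) > 0.9:                    # avoid a near-parallel helper axis
        tmp = np.array([0.0, 1.0, 0.0])
    b1 = np.cross(g, tmp)
    b1 /= np.linalg.norm(b1)
    b2 = np.cross(g, b1)
    g_new = g + delta[0] * b1 + delta[1] * b2
    return g_new / np.linalg.norm(g_new)   # re-normalize onto the sphere
```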
|
| |
| 09:18-09:24, Paper MoAT14.9 | Add to My Program |
| I2P-Rec: Recognizing Images on Large-Scale Point Cloud Maps through Bird's Eye View Projections |
|
| Zheng, Shuhang | Zhejiang University |
| Li, Yixuan | Zhejiang University |
| Yu, Zhu | Zhejiang University |
| Yu, Beinan | Zhejiang University |
| Cao, Siyuan | Zhejiang University |
| Wang, Minhang | HAOMO.AI Technology Co., Ltd |
| Xu, Jintao | HAOMO.AI Technology Co., Ltd |
| Ai, Rui | HAOMO.AI Technology Co., Ltd |
| Gu, Weihao | HAOMO.AI Technology Co., Ltd |
| Luo, Lun | Zhejiang University |
| Shen, Hui-liang | Zhejiang University |
Keywords: Localization, SLAM, Recognition
Abstract: Place recognition is an important technique for autonomous cars to achieve full autonomy since it can provide an initial guess to online localization algorithms. Although current methods based on images or point clouds have achieved satisfactory performance, localizing the images on a large-scale point cloud map remains a fairly unexplored problem. This cross-modal matching task is challenging due to the difficulty in extracting consistent descriptors from images and point clouds. In this paper, we propose the I2P-Rec method to solve the problem by transforming the cross-modal data into the same modality. Specifically, we leverage the recent success of depth estimation networks to recover point clouds from images. We then project the point clouds into Bird's Eye View (BEV) images. Using the BEV image as an intermediate representation, we extract global features with a Convolutional Neural Network followed by a NetVLAD layer to perform matching. The experimental results evaluated on the KITTI dataset show that, with only a small set of training data, I2P-Rec achieves recall rates at Top-1% over 80% and 90% when localizing monocular and stereo images on point cloud maps, respectively. We further evaluate I2P-Rec on a 1 km trajectory dataset collected by an autonomous logistics car and show that I2P-Rec can generalize well to previously unseen environments.
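The BEV projection step can be sketched as a simple occupancy rasterization; ranges and resolution below are illustrative, not the settings used in the paper:

```python
import numpy as np

def point_cloud_to_bev(points, x_range=(0, 50), y_range=(-25, 25), res=0.25):
    """Project an (N,3) point cloud to a bird's-eye-view occupancy image.

    Points outside the crop are discarded; each remaining point marks its
    ground-plane cell as occupied.
    """
    xs, ys = points[:, 0], points[:, 1]
    mask = (xs >= x_range[0]) & (xs < x_range[1]) & \
           (ys >= y_range[0]) & (ys < y_range[1])
    xs, ys = xs[mask], ys[mask]
    h = int((x_range[1] - x_range[0]) / res)
    w = int((y_range[1] - y_range[0]) / res)
    bev = np.zeros((h, w), dtype=np.float32)
    i = ((xs - x_range[0]) / res).astype(int)
    j = ((ys - y_range[0]) / res).astype(int)
    bev[i, j] = 1.0   # mark occupied cells
    return bev
```

The resulting image can then be fed to any CNN backbone for global feature extraction.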
|
| |
| 09:24-09:30, Paper MoAT14.10 | Add to My Program |
| CoPR: Towards Accurate Visual Localization with Continuous Place-Descriptor Regression (I) |
|
| Zaffar, Mubariz | Delft University of Technology |
| Nan, Liangliang | TU Delft |
| Kooij, Julian Francisco Pieter | TU Delft |
Keywords: Localization, Mapping, SLAM, Visual Place Recognition
Abstract: Visual Place Recognition (VPR) is an image-based localization method that estimates the camera location of a query image by retrieving the most similar reference image from a map of geo-tagged reference images. In this work, we look into two fundamental bottlenecks for its localization accuracy: reference map sparseness and viewpoint invariance. Firstly, the reference images for VPR are only available at sparse poses in a map, which enforces an upper bound on the maximum achievable localization accuracy through VPR. We therefore propose Continuous Place-descriptor Regression (CoPR) to densify the map and improve localization accuracy. We study various interpolation and extrapolation models to regress additional place descriptors from only the existing references. Secondly, we compare different feature encoders and show that CoPR presents value for all of them. We evaluate our models on three existing public datasets and report on average around 30% improvement in VPR-based localization accuracy using CoPR, on top of the 15% increase by using a viewpoint-variant loss for the feature encoder. The complementary relation between CoPR and relative pose estimation is also discussed.
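The simplest descriptor-regression model for densifying a sparse reference map is linear interpolation between two neighbouring references; the sketch below is only one of the interpolation models the abstract alludes to, with illustrative details:

```python
import numpy as np

def regress_descriptor(pose_a, desc_a, pose_b, desc_b, pose_q):
    """Linearly interpolate a place descriptor at a query pose lying
    between two reference poses, then re-normalize (place descriptors
    are typically L2-normalised). Illustrative sketch only.
    """
    seg = pose_b - pose_a
    t = np.dot(pose_q - pose_a, seg) / np.dot(seg, seg)  # projection onto segment
    t = np.clip(t, 0.0, 1.0)
    d = (1 - t) * desc_a + t * desc_b
    return d / np.linalg.norm(d)
```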
|
| |
| 09:30-09:36, Paper MoAT14.11 | Add to My Program |
| Complete Closed-Form and Accurate Solution to Pose Estimation from 3D Correspondences |
|
| Malis, Ezio | Inria |
Keywords: Localization, SLAM, Autonomous Vehicle Navigation
Abstract: Computing the pose from 3D data acquired in two different frames is of high importance for several robotic tasks like odometry, SLAM and place recognition. The pose is generally obtained by solving a least-squares problem given point-to-point, point-to-plane or point-to-line correspondences. The non-linear least-squares problem can be solved by iterative optimization or, more efficiently, in closed form by using solvers of polynomial systems. In this paper, a complete and accurate closed-form solution for a weighted least-squares problem is proposed. Adding a weight for each correspondence increases robustness to outliers. Contrary to existing methods, the proposed approach is complete since it is able to solve the problem in any non-degenerate case, and it is accurate since it is guaranteed to find the global optimum of the weighted least-squares problem. Simulations and experiments on real data demonstrate the superior accuracy and robustness of the proposed algorithm compared to previous approaches.
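For the point-to-point case, the weighted least-squares pose has a classic closed-form solution via SVD (Kabsch/Horn). The sketch below covers only that case, not the paper's more general solver that also handles point-to-plane and point-to-line constraints:

```python
import numpy as np

def weighted_point_registration(P, Q, w):
    """Closed-form weighted least-squares rigid alignment of point sets.

    P, Q: (N,3) corresponding points; w: (N,) non-negative weights.
    Returns R, t minimizing sum_i w_i ||R @ P[i] + t - Q[i]||^2.
    """
    w = w / w.sum()
    p_bar = (w[:, None] * P).sum(0)              # weighted centroids
    q_bar = (w[:, None] * Q).sum(0)
    H = (w[:, None] * (P - p_bar)).T @ (Q - q_bar)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = q_bar - R @ p_bar
    return R, t
```

Down-weighting suspected outliers via `w` directly implements the robustness idea mentioned in the abstract.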
|
| |
| 09:36-09:42, Paper MoAT14.12 | Add to My Program |
| Toward Consistent and Efficient Map-Based Visual-Inertial Localization: Theory Framework and Filter Design (I) |
|
| Zhang, Zhuqing | Zhejiang University |
| Song, Yang | University of Technology Sydney |
| Huang, Shoudong | University of Technology, Sydney |
| Xiong, Rong | Zhejiang University |
| Wang, Yue | Zhejiang University |
Keywords: Localization, Sensor Fusion, SLAM, Consistent Filter
Abstract: This paper focuses on designing a consistent and efficient filter for visual-inertial localization given a pre-built map. First, we propose a new Lie group with its algebra, based on which a novel invariant extended Kalman filter (invariant EKF) is designed. We theoretically prove that, when we do not consider the uncertainty of map information, the proposed invariant EKF is able to naturally preserve the correct observability properties of the system. To consider the uncertainty of map information, we introduce a Schmidt filter. With the Schmidt filter, the uncertainty of map information can be taken into consideration to avoid over-confident estimation while the computation cost only increases linearly with the size of the map keyframes. In addition, we introduce an easily implemented observability-constrained technique because directly combining the invariant EKF with the Schmidt filter cannot maintain the correct observability properties of the system that considers the uncertainty of map information. Finally, we validate our proposed system's high consistency, accuracy, and efficiency via extensive simulations and real world experiments.
|
| |
| 09:42-09:48, Paper MoAT14.13 | Add to My Program |
| WiFi Similarity-Based Odometry (I) |
|
| Ismail, Khairuldanial | Singapore University of Technology and Design |
| Liu, Ran | Southwest University of Science and Technology |
| Athukorala, Achala | Singapore University of Technology and Design |
| Ng, Benny Kai Kiat | Singapore University of Technology and Design |
| Yuen, Chau | Nanyang Technological University |
| Tan, U-Xuan | Singapore University of Technology and Design |
Keywords: Localization
Abstract: Odometry is commonly used in localization applications, especially with wheeled platforms, since encoders are readily available. It is often used by itself or fused with other sensor data to obtain a better estimate. However, it is limited to wheeled platforms, whereas similar odometry options are often desired on other systems. Given that WiFi is ubiquitous in most commercial and industrial areas, in this paper a method is proposed for obtaining odometry from WiFi scans for position estimation. The method is not constrained to wheeled robots, as is the case for wheel odometry, and does not rely on the traditional fingerprinting method. The proposed method involves training a neural network model to predict the distance moved based on features extracted from WiFi scans in the environment. These distances are then summed up to obtain the trajectory. Experiments are conducted and the methods are evaluated based on Root Mean Square Error (RMSE). Experimental results show that the proposed method achieves an RMSE of at most 8.39m across the various test cases.
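Features for such a distance-regression model could be derived from consecutive WiFi scans; the two toy features below (mean RSSI change over shared access points, and the overlap of visible access points) are purely illustrative assumptions, not the paper's feature set:

```python
import numpy as np

def scan_pair_features(scan_a, scan_b):
    """Toy features from two WiFi scans (dicts: BSSID -> RSSI in dBm)
    for a distance-moved regressor: mean absolute RSSI change over
    shared APs, and the Jaccard overlap of visible APs.
    """
    shared = set(scan_a) & set(scan_b)
    union = set(scan_a) | set(scan_b)
    mean_drssi = (sum(abs(scan_a[b] - scan_b[b]) for b in shared) / len(shared)
                  if shared else 0.0)
    overlap = len(shared) / len(union) if union else 0.0
    return np.array([mean_drssi, overlap])
```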
|
| |
| MoAT15 Regular session, 321 |
Add to My Program |
| Sensor Fusion for SLAM |
|
| |
| Chair: Huang, Guoquan | University of Delaware |
| Co-Chair: Li, Lu | Carnegie Mellon University |
| |
| 08:30-08:36, Paper MoAT15.1 | Add to My Program |
| LIO-PPF: Fast LiDAR-Inertial Odometry Via Incremental Plane Pre-Fitting and Skeleton Tracking |
|
| Chen, Xingyu | Peking University |
| Wu, Peixi | Peking University |
| Li, Ge | Peking University Shenzhen Graduate School |
| Li, Thomas H. | Advanced Institute of Information Technology, Peking University; |
Keywords: SLAM, Mapping, Localization
Abstract: As a crucial infrastructure of intelligent mobile robots, LiDAR-Inertial odometry (LIO) provides the basic capability of state estimation by tracking LiDAR scans. The high-accuracy tracking generally involves the kNN search, which is used when minimizing the point-to-plane distance. The cost for this, however, is maintaining a large local map and performing a kNN plane fit for each point. In this work, we reduce both the time and space complexity of LIO by saving these unnecessary costs. Technically, we design a plane pre-fitting (PPF) pipeline to track the basic skeleton of the 3D scene. In PPF, planes are not fitted individually for each scan, let alone for each point, but are updated incrementally as the scene 'flows'. Unlike kNN, the PPF is more robust to noisy and non-strict planes with our iterative Principal Component Analysis (iPCA) refinement. Moreover, a simple yet effective sandwich layer is introduced to eliminate false point-to-plane matches. Our method was extensively tested on a total of 22 sequences across 5 open datasets, and evaluated in 3 existing state-of-the-art LIO systems. In contrast, LIO-PPF can consume only 36% of the original local map size to achieve up to 4x faster residual computing and 1.92x overall FPS, while maintaining the same level of accuracy. We fully open source our implementation at https://github.com/xingyuuchen/LIO-PPF.
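A single PCA plane fit, the building block such an iterative refinement repeats after re-selecting inliers, can be written compactly (details here are generic, not the paper's iPCA):

```python
import numpy as np

def pca_plane_fit(points):
    """Fit a plane to an (N,3) point cluster via PCA: the eigenvector of
    the covariance matrix with the smallest eigenvalue is the plane
    normal. Returns (normal, d) with the plane as n . x + d = 0.
    """
    centroid = points.mean(axis=0)
    cov = np.cov((points - centroid).T)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    normal = eigvecs[:, 0]                   # direction of least variance
    d = -normal @ centroid
    return normal, d
```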
|
| |
| 08:36-08:42, Paper MoAT15.2 | Add to My Program |
| EDI: ESKF-Based Disjoint Initialization for Visual-Inertial SLAM Systems |
|
| Wang, Weihan | Stevens Institute of Technology |
| Li, Jiani | Vanderbilt University |
| Ming, Yuhang | Hangzhou Dianzi University |
| Mordohai, Philippos | Stevens Institute of Technology |
Keywords: Visual-Inertial SLAM, SLAM, Localization
Abstract: Visual-inertial initialization can be classified into joint and disjoint approaches. Joint approaches tackle both the visual and the inertial parameters together by aligning observations from feature-bearing points based on IMU integration, then use a closed-form solution with visual and acceleration observations to find the initial velocity and gravity. In contrast, disjoint approaches independently solve the Structure from Motion (SFM) problem and determine inertial parameters from up-to-scale camera poses obtained from pure monocular SLAM. However, previous disjoint methods have limitations, like assuming negligible acceleration bias impact or accurate rotation estimation by pure monocular SLAM. To address these issues, we propose EDI, a novel approach for fast, accurate, and robust visual-inertial initialization. Our method incorporates an Error-state Kalman Filter (ESKF) to estimate gyroscope bias and correct rotation estimates from monocular SLAM, overcoming dependence on pure monocular SLAM for rotation estimation. To estimate the scale factor without prior information, we offer a closed-form solution for initial velocity, scale, gravity, and acceleration bias estimation. To address gravity and acceleration bias coupling, we introduce weights in the linear least-squares equations, ensuring acceleration bias observability and handling outliers. Extensive evaluation on the EuRoC dataset shows that our method achieves an average scale error of 5.8% in less than 3 seconds, outperforming other state-of-the-art disjoint visual-inertial initialization approaches, even in challenging environments and with artificial noise corruption.
|
| |
| 08:42-08:48, Paper MoAT15.3 | Add to My Program |
| SELVO: A Semantic-Enhanced Lidar-Visual Odometry |
|
| Jiang, Kun | UCAS |
| Gao, Shuang | OPPO Research Institute |
| Zhang, Xudong | OPPO Research Institute |
| Li, Jijunnan | OPPO Research Institute |
| Guo, Yandong | OPPO Research Institute |
| Shijie, Liu | Hangzhou Institute for Advanced Study, UCAS |
| Li, Chunlai | Shanghai Institute of Technical Physics (SITP) , Chinese Academy |
| Wang, Jianyu | Shanghai Institute of Technical Physics of the Chinese Academy O |
Keywords: SLAM, Localization, Computer Vision for Automation
Abstract: In the face of complex external environments, information from a single sensor can no longer meet the accuracy requirements of low-drift SLAM. In this paper, we focus on the fusion of cameras and lidar, and explore the gain semantic information brings to a SLAM system. A Semantic-Enhanced Lidar-Visual Odometry (SELVO) is proposed to achieve pose estimation with high accuracy and robustness by applying semantics and utilizing strategies for initialization and sensor fusion. In the loop closure detection thread, we propose a novel place recognition method based on semantic information to maintain the global consistency of the map. In the back-end, we design a joint optimization framework including visual odometry, lidar odometry and loop closure detection, and innovatively propose to recognize degraded scenes with semantic information. We have conducted a large number of experiments on the KITTI and KITTI-360 datasets, and the results show that our system achieves high accuracy and competitive performance in comparison with state-of-the-art methods.
|
| |
| 08:48-08:54, Paper MoAT15.4 | Add to My Program |
| LIWO: Lidar-Inertial-Wheel Odometry |
|
| Yuan, Zikang | Huazhong University, Wuhan, 430073, China |
| Lang, Fengtian | Huazhong University of Science and Technology |
| Xu, Tianle | Huazhong University of Science and Technology |
| Yang, Xin | Huazhong University of Science and Technology |
Keywords: SLAM, Localization
Abstract: LiDAR-inertial odometry (LIO), which fuses complementary information from a LiDAR and an Inertial Measurement Unit (IMU), is an attractive solution for state estimation. In LIO, both pose and velocity are regarded as state variables that need to be solved. However, the widely-used Iterative Closest Point (ICP) algorithm can only provide constraints for the pose, while the velocity can only be constrained by IMU pre-integration. As a result, the velocity estimates tend to be updated in accordance with the pose results. In this paper, we propose LIWO, an accurate and robust LiDAR-inertial-wheel (LIW) odometry, which fuses the measurements from LiDAR, IMU and wheel encoder in a bundle adjustment (BA) based optimization framework. The involvement of a wheel encoder provides velocity measurements as important observations, which assist LIO in providing a more accurate state prediction. In addition, constraining the velocity variable with the wheel encoder observations in the optimization further improves the accuracy of state estimation. Experimental results on two public datasets demonstrate that our system outperforms all state-of-the-art LIO systems in terms of smaller absolute trajectory error (ATE), and that embedding a wheel encoder can greatly improve the performance of BA-based LIO.
|
| |
| 08:54-09:00, Paper MoAT15.5 | Add to My Program |
| VIW-Fusion: Extrinsic Calibration and Pose Estimation for Visual-IMU-Wheel Encoder System |
|
| Qiao, Chunxiao | Northeastern University, College of Information Science and Engi |
| Zhao, Shuying | Northeastern University |
| Zhang, Yunzhou | Northeastern University |
| Wang, Yahui | UISEE (Beijing) Ltd |
| Zhang, Dan | Uisee Technology (Beijing) Co., Ltd |
Keywords: Visual-Inertial SLAM, Localization, Sensor Fusion
Abstract: The data fusion of camera, IMU, and wheel encoder measurements has proved its effectiveness in localizing ground robots, and obtaining accurate sensor extrinsic parameters is its premise. We propose an extrinsic parameter calibration algorithm and a multi-sensor-based pose estimation algorithm for the camera-IMU-wheel encoder system. First, we propose a joint calibration algorithm for the extrinsic parameters of the camera-IMU-wheel encoder system, which improves the accuracy and robustness of the camera-wheel encoder calibration. We then extend the visual-inertial odometry (VIO) to incorporate the measurements from the wheel encoder and weight the wheel encoder measurements according to angular velocity in global optimization to improve the performance. We further propose a novel method for VIO initialization by integrating wheel encoder information, which significantly reduces the scale error in initialization. We conduct extrinsic parameter calibration experiments on a real self-driving car and validate the performance of our multi-sensor-based localization system on the KAIST dataset and a dataset collected by our self-driving vehicles by performing an exhaustive comparison with the state-of-the-art algorithms. Our implementations are open source https://github.com/chunxiaoqiao/VIW-Fusion.git.
|
| |
| 09:00-09:06, Paper MoAT15.6 | Add to My Program |
| LiDAR-Inertial SLAM with Efficiently Extracted Planes |
|
| Chen, Chao | Zhejiang University |
| Wu, Hangyu | Zhejiang University |
| Ma, Yukai | Zhejiang University |
| Lv, Jiajun | Zhejiang University |
| Li, Laijian | Zhejiang University |
| Liu, Yong | Zhejiang University |
Keywords: Mapping, Localization, SLAM
Abstract: This paper proposes a LiDAR-Inertial SLAM with efficiently extracted planes, which couples the planes in the odometry to improve accuracy and in the mapping to improve consistency. The proposed method consists of three parts: an efficient Point-to-Line-to-Plane extraction algorithm, a LiDAR-Inertial-Plane tightly coupled odometry, and plane-aided mapping with global planes. Specifically, we leverage the ring field of the LiDAR point cloud to accelerate the region-growing-based plane extraction algorithm. We propose a plane-distance-insensitive criterion for better plane association. We tightly couple the IMU pre-integration factor, LiDAR odometry factor, and plane factor in the odometry to obtain a more accurate initial pose for mapping. Furthermore, we propose a plane map management strategy based on spatial voxel hashing to improve the speed and accuracy of global map plane associations. Experimental results show that our plane extraction method is efficient, and the proposed plane-aided LiDAR-Inertial SLAM significantly improves the accuracy and consistency compared to other state-of-the-art algorithms with only a small increase in time consumption.
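Spatial voxel hashing for a plane map can be sketched as a dictionary keyed by quantized coordinates, so that associating a new plane observation only inspects nearby voxels instead of the whole map. The structure below is a minimal generic sketch, with an illustrative voxel size:

```python
def voxel_key(point, voxel_size=1.0):
    """Hashable spatial key for a 3D point (voxel size is illustrative)."""
    return (int(point[0] // voxel_size),
            int(point[1] // voxel_size),
            int(point[2] // voxel_size))

class PlaneVoxelMap:
    """Minimal voxel-hashed plane map: each voxel stores the ids of
    global planes passing through it, giving O(1) candidate lookup."""

    def __init__(self, voxel_size=1.0):
        self.voxel_size = voxel_size
        self.grid = {}

    def insert(self, plane_id, sample_points):
        # Index the plane by every voxel one of its sample points touches.
        for p in sample_points:
            key = voxel_key(p, self.voxel_size)
            self.grid.setdefault(key, set()).add(plane_id)

    def candidates(self, point):
        # Planes worth testing for association near this point.
        return self.grid.get(voxel_key(point, self.voxel_size), set())
```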
|
| |
| 09:06-09:12, Paper MoAT15.7 | Add to My Program |
| Learning to Map Efficiently by Active Echolocation |
|
| Hu, Xixi | UT Austin |
| Purushwalkam, Senthil | Salesforce Research |
| Harwath, David | UT Austin |
| Grauman, Kristen | UT Austin and Facebook AI Research |
Keywords: Audio-Visual SLAM, SLAM
Abstract: Using visual SLAM to map new environments requires time-consuming visits to all regions for data collection. We propose an approach to estimate maps of areas beyond the visible regions using a cheap and readily available modality of data---sound. We introduce the idea of an active audio-visual mapping agent. Besides collecting visual data, the proposed agent emits sounds during navigation, captures the echoes, and uses them to accurately map unknown areas. We propose a reinforcement learning-based method that simultaneously trains models to 1) estimate a map from the visual data, 2) output navigation actions, 3) output the decision to emit a sound and 4) refine estimated maps using the captured audio. Our agent is trained and tested on 85 real-world homes from the Matterport3D dataset using the Habitat and SoundSpaces simulators for visual and audio data. Our method, unlike visual-data-reliant approaches, yields more accurate maps with broader environmental coverage. In addition, compared to an agent that continually emits sounds, we observe that intelligently choosing when to emit sounds leads to accurate maps obtained with greater efficiency.
|
| |
| 09:12-09:18, Paper MoAT15.8 | Add to My Program |
| Visual-LiDAR-Inertial Odometry: A New Visual-Inertial SLAM Method Based on an iPhone 12 Pro |
|
| Ye, Cang | Virginia Commonwealth University |
| Jin, Lingqiu | Virginia Commonwealth University |
Keywords: Visual-Inertial SLAM, Range Sensing
Abstract: As today's smartphone integrates various imaging sensors and Inertial Measurement Units (IMU) and becomes computationally powerful, there is a growing interest in developing smartphone-based visual-inertial (VI) SLAM methods for robotics and computer vision applications. In this paper, we introduce a new SLAM method, called Visual-LiDAR-Inertial Odometry (VLIO), based on an iPhone 12 Pro. VLIO formulates device pose estimation as an optimization problem that minimizes a cost function based on the residuals of the inertial, visual, and depth measurements. We present the first work that 1) characterizes the iPhone's LiDAR in depth measurement and identifies the models for the measurement error and standard deviation, and 2) characterizes pose change estimation with LiDAR data. The measurement models are then used to compute the depth-related and visual-feature-related residuals for the cost function. Also, VLIO tracks varying camera intrinsic parameters (CIP) in real-time and uses them in computing these residuals. Both approaches result in more accurate residual terms and thus more accurate pose estimation. The CIP tracking method eliminates the need for a sophisticated model-fitting process that includes camera calibration and pairing of the CIPs and IMU measurements with various phone orientations. Experimental results validate the efficacy of VLIO.
|
| |
| 09:18-09:24, Paper MoAT15.9 | Add to My Program |
| Optimization-Based VINS: Consistency, Marginalization, and FEJ |
|
| Chen, Chuchu | University of Delaware |
| Geneva, Patrick | University of Delaware |
| Peng, Yuxiang | University of Delaware |
| Lee, Woosik | University of Delaware |
| Huang, Guoquan | University of Delaware |
Keywords: Visual-Inertial SLAM, Localization, SLAM
Abstract: In this work, we present a comprehensive analysis of the application of the First-estimates Jacobian (FEJ) design methodology in nonlinear optimization-based Visual-Inertial Navigation Systems (VINS). The FEJ approach fixes system linearization points to preserve proper observability properties of VINS and has been shown to significantly improve the estimation performance of state-of-the-art filtering-based methods. However, its direct application to optimization-based estimators holds challenges and pitfalls, which we address in this paper. Specifically, we carefully examine the observability and its relation to inconsistency and FEJ; based on this, we explain how to properly apply and implement FEJ within four marginalization archetypes commonly used in non-linear optimization-based frameworks. FEJ's effectiveness and applications to VINS are investigated, demonstrating significant performance improvements. Additionally, we offer a detailed discussion of results and guidelines on how to properly implement FEJ in optimization-based estimators.
|
| |
| 09:24-09:30, Paper MoAT15.10 | Add to My Program |
| Visual-Inertial-Laser-Lidar (VILL) SLAM: Real-Time Dense RGB-D Mapping for Pipe Environments |
|
| Tian, Yu | Carnegie Mellon University |
| Wang, Luyuan | Carnegie Mellon University |
| Yan, Xinzhi | Carnegie Mellon University |
| Ruan, Fujun | Carnegie Mellon University |
| Ganapathy Subramanian, Jaya Aadityaa | Carnegie Mellon University |
| Choset, Howie | Carnegie Mellon University |
| Li, Lu | Carnegie Mellon University |
Keywords: Visual-Inertial SLAM, RGB-D Perception, Sensor Fusion
Abstract: Robotic solutions for pipeline inspection promise to enhance human labor by automating data acquisition for pipe condition assessments, which are vital for the early detection of pipe anomalies and the prevention of hazardous leakages and explosions. Through simultaneous localization and mapping (SLAM), colorized 3D reconstructions of the pipe's inner surface can be generated, providing a more comprehensive digital record of the pipes than conventional vision-only inspection. Designed for generic environments, most SLAM methods suffer from limited accuracy and substantial accumulated drift in confined and featureless spaces such as pipelines, due to a lack of suitable sensor hardware and state estimation techniques. In this research, we present VILL-SLAM: a dense RGB-D SLAM algorithm that combines a monocular camera (V), an inertial sensor (I), a ring-shaped laser profiler (L), and a Lidar (L) into a compact sensor package optimized for in-pipe operations. By fusing complementary visual and depth information from the color camera, laser profiler, and Lidar, our method overcomes the challenge of metric-scale mapping faced by conventional SLAM methods, despite its monocular configuration. To further improve localization accuracy, we utilize the pipe geometry to formulate two unique optimization factors that effectively constrain odometry drift. To validate our method, we conducted real-world experiments in physical pipes, comparing the performance of our approach against other state-of-the-art algorithms. The proposed SLAM framework achieved a 6.6-fold reduction in drift, with 0.84% mean odometry drift over 22 meters and a mean pointwise 3D scanning error of 0.88 mm in 12-inch-diameter pipes. This research represents a significant advancement in miniature in-pipe inspection, localization, and mapping sensing techniques. It has the potential to become a core enabling technology for the next generation of highly capable in-pipe robots, capable of reconstructing photo-realistic 3D pipe scans and providing disruptive pipe locating and georeferencing capabilities.
|
| |
| 09:30-09:36, Paper MoAT15.11 | Add to My Program |
| Know What You Don't Know: Consistency in Sliding Window Filtering with Unobservable States Applied to Visual-Inertial SLAM |
|
| Lisus, Daniil | University of Toronto |
| Cohen, Mitchell | McGill University |
| Forbes, James Richard | McGill University |
Keywords: Visual-Inertial SLAM, Autonomous Vehicle Navigation, SLAM
Abstract: Estimation algorithms, such as the sliding window filter, produce an estimate and uncertainty of desired states. This task becomes challenging when the problem involves unobservable states. In these situations, it is critical for the algorithm to ``know what it doesn't know'', meaning that it must maintain the unobservable states as unobservable during algorithm deployment. This letter presents general requirements for maintaining consistency in sliding window filters involving unobservable states. The value of these requirements when designing a navigation solution is experimentally shown within the context of visual-inertial SLAM making use of IMU preintegration.
|
| |
| 09:36-09:42, Paper MoAT15.12 | Add to My Program |
| Versatile LiDAR-Inertial Odometry with SE(2) Constraints for Ground Vehicles |
|
| Jiaying, Chen | Nanyang Technological University |
| Wang, Han | Nanyang Technological University |
| Hu, Minghui | Nanyang Technological University |
| Suganthan, Ponnuthurai Nagaratnam | Nanyang Technological University |
Keywords: SLAM, Localization, Industrial Robots
Abstract: LiDAR SLAM has become one of the major localization systems for ground vehicles since LiDAR Odometry and Mapping (LOAM). Many extensions of LOAM leverage one specific constraint to improve performance, e.g., information from on-board sensors such as loop closure and inertial state, or prior conditions such as ground level and motion dynamics. In many robotic applications these conditions are only partially known, so SLAM becomes a comprehensive problem involving numerous constraints, and a better SLAM result can be achieved by fusing them properly. In this paper, we propose a hybrid LiDAR-inertial SLAM framework that leverages both the on-board perception system and prior information such as motion dynamics to improve localization performance. In particular, we consider the case of ground vehicles, which are commonly used for autonomous driving and warehouse logistics. We present a computationally efficient LiDAR-inertial odometry method that directly parameterizes ground vehicle poses on SE(2). Out-of-SE(2) motion perturbations are not neglected but are incorporated into an integrated noise term of a novel SE(2)-constraints model. For odometric measurement processing, we propose a versatile, tightly coupled LiDAR-inertial odometry that achieves better pose estimation than traditional LiDAR odometry.
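Parameterizing ground-vehicle poses on SE(2) means each odometry increment is composed strictly in the plane. A minimal sketch of such a composition follows; the function name is hypothetical, and the paper's treatment of out-of-SE(2) perturbations (folding them into an inflated noise term) is noted in a comment rather than modeled:

```python
import math

def se2_compose(pose, delta):
    """Compose an SE(2) pose (x, y, yaw) with a body-frame increment.

    Out-of-plane motion (roll, pitch, z) is not represented here; the
    paper instead absorbs such perturbations into an integrated noise
    term of its SE(2)-constraints model.
    """
    x, y, th = pose
    dx, dy, dth = delta
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            # wrap yaw to (-pi, pi] via atan2
            math.atan2(math.sin(th + dth), math.cos(th + dth)))
```

Composing a forward step after a quarter turn, for example, moves the vehicle along its new heading rather than the world x-axis, which is the planar constraint the estimator exploits.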
|
| |
| 09:42-09:48, Paper MoAT15.13 | Add to My Program |
| ESVIO: Event-Based Stereo Visual Inertial Odometry |
|
| Chen, Peiyu | The University of Hong Kong |
| Guan, Weipeng | The University of Hong Kong |
| Lu, Peng | The University of Hong Kong |
Keywords: Visual-Inertial SLAM, Sensor Fusion, Aerial Systems: Perception and Autonomy
Abstract: Event cameras that asynchronously output low-latency event streams provide great opportunities for state estimation under challenging situations. Although event-based visual odometry has been extensively studied in recent years, most approaches are monocular, and there has been little research on stereo event vision. In this paper, we present ESVIO, the first event-based stereo visual-inertial odometry, which leverages the complementary advantages of event streams, standard images, and inertial measurements. Our proposed pipeline achieves spatial and temporal associations between consecutive stereo event streams, thereby obtaining robust state estimation. In addition, a motion compensation method is designed to emphasize scene edges by warping each event to reference moments using the IMU and the ESVIO back-end. We validate that both ESIO (purely event-based) and ESVIO (event- and image-aided) achieve superior performance compared with other image-based and event-based baseline methods on public and self-collected datasets. Furthermore, we use our pipeline to perform onboard quadrotor flights in low-light environments. A real-world large-scale experiment is also conducted to demonstrate long-term effectiveness. We highlight that this work is a real-time, accurate system aimed at robust state estimation in challenging environments.
|
| |
| MoAT16 Regular session, 330A |
Add to My Program |
| Autonomous Agents |
|
| |
| Chair: Xiao, Jing | Worcester Polytechnic Institute (WPI) |
| Co-Chair: Keren, Sarah | Technion - Israel Institute of Technology |
| |
| 08:30-08:36, Paper MoAT16.1 | Add to My Program |
| Catch Me If You Hear Me: Audio-Visual Navigation in Complex Unmapped Environments with Moving Sounds |
|
| Younes, Abdelrahman | KIT |
| Honerkamp, Daniel | Albert-Ludwigs-Universität Freiburg |
| Welschehold, Tim | Albert-Ludwigs-Universität Freiburg |
| Valada, Abhinav | University of Freiburg |
Keywords: Autonomous Agents, Reactive and Sensor-Based Planning, Reinforcement Learning
Abstract: Audio-visual navigation combines sight and hearing to navigate to a sound-emitting source in an unmapped environment. While recent approaches have demonstrated the benefits of audio input to detect and find the goal, they focus on clean and static sound sources and struggle to generalize to unheard sounds. In this work, we propose the novel dynamic audio-visual navigation benchmark which requires catching a moving sound source in an environment with noisy and distracting sounds, posing a range of new challenges. We introduce a reinforcement learning approach that learns a robust navigation policy for these complex settings. To achieve this, we propose an architecture that fuses audio-visual information in the spatial feature space to learn correlations of geometric information inherent in both local maps and audio signals. We demonstrate that our approach consistently outperforms the current state-of-the-art by a large margin across all tasks of moving sounds, unheard sounds, and noisy environments, on two challenging 3D scanned real-world environments, namely Matterport3D and Replica. The benchmark is available at http://dav-nav.cs.uni-freiburg.de.
|
| |
| 08:36-08:42, Paper MoAT16.2 | Add to My Program |
| Joint Imitation Learning of Behavior Decision and Control for Autonomous Intersection Navigation |
|
| Zhu, Zeyu | Key Laboratory of Machine Perception, Peking University |
| Zhao, Huijing | Peking University |
Keywords: Autonomous Agents, Control Architectures and Programming
Abstract: Modern autonomous driving systems face substantial challenges when navigating dense intersections due to the high uncertainty introduced by other road users. Due to the complexity of the task, the autonomous vehicle needs to generate policies at multiple levels of abstraction. However, previous deep imitation learning methods focused on learning control policies while using simple rule-based behavior models. To bridge this gap and achieve human-like driving, we develop a hierarchy of high-level behavior decision and low-level control, where both policies are jointly learned from human demonstrations based on imitation learning. Over 60 hours of driving data from 10 drivers at six intersections was collected. The proposed method is extensively evaluated in challenging intersection scenarios. Empirical results demonstrate the method's superior performance over baselines in terms of task completion and control quality. We demonstrate the importance of learning human-like behavior decisions as well as joint learning of behavior and control policies. The capability of imitating different driving styles is also illustrated.
|
| |
| 08:42-08:48, Paper MoAT16.3 | Add to My Program |
| Improving the Performance of Backward Chained Behavior Trees That Use Reinforcement Learning |
|
| Kartašev, Mart | KTH Royal Institute of Technology |
| Salér, Justin | KTH |
| Ogren, Petter | Royal Institute of Technology (KTH) |
Keywords: Behavior-Based Systems, Autonomous Agents, Control Architectures and Programming
Abstract: In this paper we show how to improve the performance of backward chained behavior trees (BTs) that include policies trained with reinforcement learning (RL). BTs represent a hierarchical and modular way of combining control policies into higher-level control policies. Backward chaining is a design principle for the construction of BTs that combines reactivity with goal-directed actions in a structured way. The backward chained structure has also enabled convergence proofs for BTs, identifying a set of local conditions to be satisfied for the convergence of all trajectories to a set of desired goal states. The key idea of this paper is to improve the performance of backward chained BTs by using the conditions identified in a theoretical convergence proof to configure the RL problems for individual controllers. Specifically, previous analysis identified so-called active constraint conditions (ACCs) that should not be violated in order to avoid having to return to work on previously achieved subgoals. We propose a way to set up the RL problems such that they not only achieve each immediate subgoal but also avoid violating the identified ACCs. The resulting performance improvement depends on how often ACC violations occurred before the change, and how much effort, in terms of execution time, was needed to re-achieve them. The proposed approach is illustrated in a dynamic simulation environment.
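One way to read the proposed setup is as reward shaping for each BT sub-policy: reward reaching the local subgoal while penalizing violations of active constraint conditions (ACCs), i.e. previously achieved subgoals that must remain true. The sketch below is a hedged illustration with hypothetical constants, not the authors' exact reward design:

```python
def shaped_reward(goal_reached, acc_satisfied, step_cost=0.01):
    """Per-step reward for one sub-policy in a backward chained BT.

    A small step cost encourages progress, the goal bonus rewards the
    immediate subgoal, and the ACC penalty discourages undoing earlier
    subgoals (constants here are illustrative placeholders).
    """
    r = -step_cost
    if goal_reached:
        r += 1.0
    if not acc_satisfied:
        r -= 1.0  # violating an ACC forces costly re-achievement later
    return r
```

Under this shaping, a trajectory that reaches its subgoal by breaking an ACC is worth less than one that respects the constraint, which mirrors the paper's motivation of avoiding returns to already-completed work.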
|
| |
| 08:48-08:54, Paper MoAT16.4 | Add to My Program |
| Fast Decision Support for Air Traffic Management at Urban Air Mobility Vertiports Using Graph Learning |
|
| KrisshnaKumar, Prajit | University at Buffalo |
| Witter, Jhoel | University at Buffalo |
| Paul, Steve | University at Buffalo |
| Cho, Hanvit | State University of New York at Buffalo |
| Dantu, Karthik | University of Buffalo |
| Chowdhury, Souma | University at Buffalo, State University of New York |
Keywords: Intelligent Transportation Systems, Multi-Robot Systems, Reinforcement Learning
Abstract: Urban Air Mobility (UAM) promises a new dimension of decongested, safe, and fast travel in urban and suburban hubs. These UAM aircraft are conceived to operate from small airports called vertiports, each comprising multiple take-off/landing and battery-recharging spots. Since vertiports might be situated in dense urban areas and need to handle many aircraft landings and take-offs each hour, managing this schedule in real time becomes challenging for a traditional air-traffic controller and instead calls for an automated solution. This paper provides a novel approach to this problem of Urban Air Mobility - Vertiport Schedule Management (UAM-VSM), which leverages graph reinforcement learning to generate decision-support policies. Here the designated physical spots within the vertiport's airspace and the vehicles being managed are represented as two separate graphs, with feature extraction performed through a graph convolutional network (GCN). Extracted features are passed to perceptron layers to decide actions such as continue to hover or cruise, continue idling or take off, or land on an allocated vertiport spot. Performance is measured based on delays, safety (number of collisions), and battery consumption. Through realistic simulations in AirSim applied to scaled-down multi-rotor vehicles, our results demonstrate the suitability of graph reinforcement learning for solving the UAM-VSM problem and its superiority over basic reinforcement learning (with graph embeddings) and random-choice baselines.
|
| |
| 08:54-09:00, Paper MoAT16.5 | Add to My Program |
| Scaling Vision-Based End-To-End Autonomous Driving with Multi-View Attention Learning |
|
| Xiao, Yi | Computer Vision Center, Universitat Autònoma De Barcelona |
| Codevilla, Felipe | Mila/ Independent Robotics |
| Porres, Diego | Computer Vision Center, Universitat Autònoma De Barcelona |
| Lopez, Antonio M. | Computer Vision Center, Universitat Autonoma De Barcelona |
Keywords: Autonomous Agents, Imitation Learning, Intelligent Transportation Systems
Abstract: In end-to-end driving, human driving demonstrations are used to train perception-based driving models by imitation learning. This process is supervised by vehicle signals (e.g., steering angle, acceleration) but does not require extra costly supervision (human labeling of sensor data). As a representative of such vision-based end-to-end driving models, CILRS is commonly used as a baseline against which new driving models are compared. So far, some recent models achieve better performance than CILRS by using expensive sensor suites and/or large amounts of human-labeled data for training. Given the difference in performance, one may think that it is not worth pursuing vision-based pure end-to-end driving. However, we argue that this approach still has great value and potential considering cost and maintenance. In this paper, we present CIL++, which improves on CILRS by both processing higher-resolution images using a human-inspired horizontal field of view (HFOV) as an inductive bias and incorporating a proper attention mechanism. CIL++ achieves competitive performance compared to models that are more costly to develop. We propose to replace CILRS with CIL++ as a strong vision-based pure end-to-end driving baseline supervised only by vehicle signals and trained by conditional imitation learning.
|
| |
| 09:00-09:06, Paper MoAT16.6 | Add to My Program |
| Value of Assistance for Mobile Agents |
|
| Amuzig, Adi | Technion - Israel Institute of Technology |
| Dovrat, David | Technion |
| Keren, Sarah | Technion - Israel Institute of Technology |
Keywords: Autonomous Agents, Probability and Statistical Methods, Localization
Abstract: Mobile robotic agents often suffer from localization uncertainty which grows with time and with the agents' movement. This can hinder their ability to accomplish their task. In some settings, it may be possible to perform assistive actions that reduce uncertainty about a robot's location. For example, in a collaborative multi-robot system, a wheeled robot can request assistance from a drone that can fly to its estimated location and reveal its exact location on the map or accompany it to its intended location. Since assistance may be costly and limited, and may be requested by different members of a team, there is a need for principled ways to support the decision of which assistance to provide to an agent and when, as well as to decide which agent to help within a team. For this purpose, we propose Value of Assistance (VOA) to represent the expected cost reduction that assistance will yield at a given point of execution. We offer ways to compute VOA based on estimations of the robot's future uncertainty, modeled as a Gaussian process. We specify conditions under which our VOA measures are valid and empirically demonstrate the ability of our measures to predict the agent's average cost reduction when receiving assistance in both simulated and real-world robotic settings.
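A simplified instance of VOA can be computed directly from a predicted uncertainty trajectory. Here assistance is assumed to reset uncertainty to zero, after which it regrows at the original rate, and the per-step cost is taken as proportional to uncertainty; both are illustrative assumptions (the paper models future uncertainty with a Gaussian process and specifies validity conditions):

```python
import numpy as np

def value_of_assistance(sigma_traj, assist_step, cost_per_sigma=1.0):
    """VOA sketch: expected cost saved by assisting at `assist_step`.

    `sigma_traj` is the predicted localization uncertainty per step.
    Assistance collapses uncertainty to zero at that step, after which
    it regrows with the original per-step increments.
    """
    sigma = np.asarray(sigma_traj, dtype=float)
    cost_without = cost_per_sigma * sigma[assist_step:].sum()
    # after help, uncertainty restarts from zero with the same increments
    regrown = sigma[assist_step:] - sigma[assist_step]
    cost_with = cost_per_sigma * regrown.sum()
    return cost_without - cost_with
```

Comparing this value across candidate assist times, or across robots in a team, gives the kind of decision support the abstract describes.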
|
| |
| 09:06-09:12, Paper MoAT16.7 | Add to My Program |
| Feature Explanation for Robust Trajectory Prediction |
|
| Zhai, Xukai | Wuhan University of Technology |
| Hu, Renze | Wuhan University of Technology |
| Yin, Zhishuai | Wuhan Universuty of Technology |
Keywords: Autonomous Agents, AI-Based Methods, Deep Learning Methods
Abstract: Trajectory prediction of neighboring agents is a critical task for high-speed robotics such as autonomous vehicles. To obtain fine-grained and robust scene representations, existing works attempt to incorporate abundant information that is deemed relevant. The cost, however, is a heavy computational burden and, more importantly, the inevitable interference introduced by redundant information. In this paper, we exploit explainable AI (XAI) techniques and propose an "Encoder-Decoder" model named the parallel explainable Transformer (PXT) to identify the contributive features for robust trajectory prediction. A two-branch encoder is designed to disentangle the roadway information and the agents' historical trajectories for better feature explanation. Two stages of feature explanation are incorporated into the encoder. In the first stage, an explainable Transformer (XT) comprising a Layer-wise Relevance Propagation (LRP)-based interpretation module is designed and implemented in both branches to score and filter the contextual and motion features. In the second stage, the ProbSparse attention mechanism is adopted to measure the level of interactivity with sparsity, so that attention focuses on the relationships among highly interactive agents. Results on the Argoverse benchmark show that our model achieves state-of-the-art (SOTA) performance without delicate and tedious network design, demonstrating the effectiveness of tracing and retaining contributive features in enhancing trajectory prediction.
|
| |
| 09:12-09:18, Paper MoAT16.8 | Add to My Program |
| Adversarial Driving Behavior Generation Incorporating Human Risk Cognition for Autonomous Vehicle Evaluation |
|
| Liu, Zhen | Jilin University |
| Gao, Hang | Jilin University |
| Ma, Hao | Jilin University |
| Cai, Shuo | Jilin University |
| Hu, Yunfeng | Jilin University |
| Qu, Ting | Jilin University |
| Chen, Hong | Tongji University |
| Gong, Xun | Jilin University |
Keywords: Autonomous Agents, Cognitive Modeling, Reinforcement Learning
Abstract: Autonomous vehicle (AV) evaluation has been the subject of increased interest in recent years, both in industry and in academia. This paper focuses on the development of a novel framework for generating adversarial driving behavior of a background vehicle interfering with the AV, in order to expose effective and rational risky events. Specifically, the adversarial behavior is learned by a reinforcement learning (RL) approach incorporating cumulative prospect theory (CPT), which allows representation of human risk cognition. An extended version of the deep deterministic policy gradient (DDPG) technique is then proposed for training the adversarial policy while ensuring training stability, as the CPT action-value function is leveraged. A comparative case study on the cut-in scenario is conducted on a high-fidelity Hardware-in-the-Loop (HiL) platform, and the results demonstrate the adversarial effectiveness in exposing weaknesses of the tested AV.
|
| |
| 09:18-09:24, Paper MoAT16.9 | Add to My Program |
| Predicting Center of Mass by Iterative Pushing for Object Transportation and Manipulation |
|
| Hyland, Steven Michael | Worcester Polytechnic Institute |
| Xiao, Jing | Worcester Polytechnic Institute (WPI) |
| Onal, Cagdas | WPI |
Keywords: Autonomous Agents, Wheeled Robots, Manipulation Planning
Abstract: Robotic manipulation tasks rely on a plethora of environmental and payload information. One critical piece of information for accurate manipulation is the center of mass (CoM) of the object, which is essential for estimating the dynamic response of the system and determining the payload placement. Traditionally, the CoM of a payload is provided prior to manipulation. In order to create a more robust and comprehensive system, this information should be collected by the robotic agent before or during the task run time. This paper presents a method for approximating the CoM of a planar object using a small-scale mobile robot to inform manipulation tasks. On average, our system is able to converge on a CoM estimate in under 30 seconds in simulation and 20 seconds in experiment, with a relative error of 4.95% and 5.46%, respectively.
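The geometric idea behind estimating a planar object's CoM by pushing can be sketched as a least-squares intersection of push lines: a push whose line of action passes through the CoM produces pure translation, so each non-rotating push constrains the CoM to lie on that line. The code below is an idealized, noise-free illustration, not the authors' iterative procedure:

```python
import numpy as np

def estimate_com(push_points, push_dirs):
    """Least-squares intersection of 2D push lines.

    Each line passes through `push_points[i]` with direction
    `push_dirs[i]`. Projecting onto each line's normal gives the
    normal equations sum(P_i) x = sum(P_i p_i) for the closest point
    to all lines, the CoM estimate.
    """
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for p, d in zip(push_points, push_dirs):
        d = np.asarray(d, dtype=float) / np.linalg.norm(d)
        P = np.eye(2) - np.outer(d, d)  # projector onto the line normal
        A += P
        b += P @ np.asarray(p, dtype=float)
    return np.linalg.solve(A, b)
```

With real pushes each observed line is noisy, which is why an iterative scheme that converges over tens of seconds, as reported above, is needed in practice.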
|
| |
| 09:24-09:30, Paper MoAT16.10 | Add to My Program |
| The Impact of Overall Optimization on Warehouse Automation |
|
| Yoshitake, Hiroshi | Hitachi America Ltd |
| Abbeel, Pieter | UC Berkeley |
Keywords: Discrete Event Dynamic Automation Systems, Reinforcement Learning, Multi-Robot Systems
Abstract: In this study, we propose a novel approach for investigating the optimization performance achievable through flexible robot coordination in automated warehouses with multi-agent reinforcement learning (MARL)-based control. Automated systems using robots are expected to achieve more efficient operations than manual systems in terms of overall optimization performance. However, the impact of overall optimization on performance remains unclear in most automated systems due to a lack of suitable control methods. We therefore propose a centralized-training, decentralized-execution MARL framework as a practical overall-optimization control method. Within this framework, we also propose a single shared critic, trained with global states and rewards, that is applicable when heterogeneous agents make decisions asynchronously. Our proposed MARL framework was applied to the task selection of material handling equipment in an automated order-picking simulation, and its performance was evaluated to determine how far overall optimization outperforms partial optimization by comparing it with other MARL frameworks and rule-based control methods.
|
| |
| 09:30-09:36, Paper MoAT16.11 | Add to My Program |
| Kinematics-Only Differential Flatness Based Trajectory Tracking for Autonomous Racing |
|
| Dighe, Yashom | University at Buffalo, State University of New York |
| Kim, Youngjin | University at Buffalo |
| Rajguru, Smit | State University of New York at Buffalo |
| Turkar, Yash | University at Buffalo |
| Singh, Tarunraj | University at Buffalo |
| Dantu, Karthik | University of Buffalo |
Keywords: Autonomous Agents, Wheeled Robots, Kinematics
Abstract: In autonomous racing, accurately tracking the race line at the limits of handling is essential to guarantee competitiveness. In this study, we show the effectiveness of Differential Flatness based control for high-speed trajectory tracking of car-like robots. We compare the tracking performance and resource use of our controller against Nonlinear Model Predictive Control (NMPC) while running on embedded hardware, and show that on average KFC reduces computational resource usage by 50% while performing on par with NMPC. Our implementation of the proposed controller, the simulation environment, and detailed results are open-sourced at https://github.com/droneslab/
|
| |
| 09:36-09:42, Paper MoAT16.12 | Add to My Program |
| LEF: Late-To-Early Temporal Fusion for LiDAR 3D Object Detection |
|
| He, Tong | Waymo LLC |
| Sun, Pei | Waymo |
| Leng, Zhaoqi | Waymo LLC |
| Liu, Chenxi | Waymo |
| Anguelov, Dragomir | Waymo |
| Tan, Mingxing | Waymo Research |
Keywords: Autonomous Agents, Object Detection, Segmentation and Categorization, Semantic Scene Understanding
Abstract: We propose a late-to-early recurrent feature fusion scheme for 3D object detection using temporal LiDAR point clouds. Our main motivation is fusing object-aware latent embeddings into the early stages of a 3D object detector. This feature fusion strategy enables the model to better capture the shapes and poses for challenging objects, compared with learning from raw points directly. Our method conducts late-to-early feature fusion in a recurrent manner. This is achieved by enforcing window-based attention blocks upon temporally calibrated and aligned sparse pillar tokens. Leveraging bird's eye view foreground pillar segmentation, we reduce the number of sparse history features that our model needs to fuse into its current frame by 10x. We also propose a stochastic-length FrameDrop training technique, which generalizes the model to variable frame lengths at inference for improved performance without retraining. We evaluate our method on the widely adopted Waymo Open Dataset and demonstrate improvement on 3D object detection against the baseline model, especially for the challenging category of large objects.
|
| |
| 09:42-09:48, Paper MoAT16.13 | Add to My Program |
| Learning Behavior Trees from Planning Experts Using Decision Tree and Logic Factorization |
|
| Gugliermo, Simona | Örebro University, Scania |
| Schaffernicht, Erik | Örebro University, AASS Research Center |
| Koniaris, Christos | Scania |
| Pecora, Federico | Amazon Robotics |
Keywords: Behavior-Based Systems, Learning from Demonstration, Intelligent Transportation Systems
Abstract: The increased popularity of Behavior Trees (BTs) in different fields of robotics requires efficient methods for learning BTs from data instead of tediously handcrafting them. Recent research in learning from demonstration reported encouraging results that this paper extends, improves, and generalizes to arbitrary planning domains. We propose BT-Factor as a new method for learning expert knowledge by representing it in a BT. Execution traces of previously manually designed plans are used to generate a BT employing a combination of decision tree learning and logic factorization techniques originating from circuit design. We test BT-Factor in an industrially relevant simulation environment from a mining scenario and compare it against a state-of-the-art BT learning method. The results show that our method generates compact BTs that are easy to interpret and capable of accurately capturing the relations implicit in the training data.
|
| |
| MoAT17 Regular session, 330B |
Add to My Program |
| Imitation Learning |
|
| |
| Chair: Igl, Maximilian | Waymo LLC |
| Co-Chair: Cui, Yuchen | Stanford University |
| |
| 08:30-08:36, Paper MoAT17.1 | Add to My Program |
| Learning from Guided Play: Improving Exploration for Adversarial Imitation Learning with Simple Auxiliary Tasks |
|
| Ablett, Trevor | University of Toronto |
| Chan, Bryan | University of Alberta |
| Kelly, Jonathan | University of Toronto |
Keywords: Imitation Learning, Reinforcement Learning, Transfer Learning
Abstract: Adversarial imitation learning (AIL) has become a popular alternative to supervised imitation learning that reduces the distribution shift suffered by the latter. However, AIL requires effective exploration during an online reinforcement learning phase. In this work, we show that the standard, naïve approach to exploration can manifest as a suboptimal local maximum if a policy learned with AIL sufficiently matches the expert distribution without fully learning the desired task. This can be particularly catastrophic for manipulation tasks, where the difference between an expert and a non-expert state-action pair is often subtle. We present Learning from Guided Play (LfGP), a framework in which we leverage expert demonstrations of multiple exploratory auxiliary tasks in addition to a main task. The addition of these auxiliary tasks forces the agent to explore states and actions that standard AIL may learn to ignore. Additionally, this particular formulation allows the reusability of expert data between main tasks. Our experimental results in a challenging multitask robotic manipulation domain indicate that LfGP significantly outperforms both AIL and behavioral cloning (BC), while also being more expert-sample efficient than these baselines. To explain this performance gap, we provide further analysis of a toy problem that highlights the coupling between a local maximum and poor exploration, and also visualize the differences between the models learned by AIL and LfGP.
|
| |
| 08:36-08:42, Paper MoAT17.2 | Add to My Program |
| Hierarchical Decision Transformer |
|
| Correia, André | Universidade Da Beira Interior and NOVA LINCS |
| Alexandre, Luís A. | Univ. Beira Interior and NOVA LINCS |
Keywords: Imitation Learning, Deep Learning Methods, Machine Learning for Robot Control
Abstract: Sequence models in reinforcement learning require task knowledge to estimate the task policy. This paper presents the hierarchical decision transformer (HDT). HDT is a hierarchical behavior cloning algorithm that improves the performance of transformer methods in imitation learning, improving their robustness to tasks with longer episodes and/or sparse rewards, without requiring the task knowledge or user interaction present in the current state of the art. The high-level mechanism guides the low-level controller through the task by selecting subgoals for the latter to reach. This sequence replaces the returns-to-go of previous methods, improving performance overall, especially in tasks with longer episodes and scarcer rewards. We validate our method on multiple tasks from the OpenAI Gym, D4RL, and RoboMimic benchmarks. Our method outperforms the baselines in twenty-three out of thirty-one settings of varied horizons and reward frequencies without prior task knowledge, showing the advantages of the hierarchical model approach for learning from demonstrations using a sequence model. We also evaluate the method on a reaching task on a physical robot.
|
| |
| 08:42-08:48, Paper MoAT17.3 | Add to My Program |
| ProDMPs: A Unified Perspective on Dynamic and Probabilistic Movement Primitives |
|
| Li, Ge | Karlsruhe Institute of Technology (KIT) |
| Jin, Zeqi | Karlsruhe Institute of Technology |
| Volpp, Michael | Karlsruhe Institute of Technology |
| Otto, Fabian | Bosch Center for AI, University of Tuebingen |
| Lioutikov, Rudolf | Karlsruhe Institute of Technology |
| Neumann, Gerhard | Karlsruhe Institute of Technology |
Keywords: Imitation Learning, Machine Learning for Robot Control
Abstract: Movement Primitives (MPs) are a well-known concept to represent and generate modular trajectories. MPs can be broadly categorized into two types: (a) dynamics-based approaches that generate smooth trajectories from any initial state, e.g., Dynamic Movement Primitives (DMPs), and (b) probabilistic approaches that capture higher-order statistics of the motion, e.g., Probabilistic Movement Primitives (ProMPs). To date, however, there is no MP method that unifies both, i.e., one that can generate smooth trajectories from an arbitrary initial state while capturing higher-order statistics. In this paper, we introduce a unified perspective of both approaches by solving the ODE underlying the DMPs. We convert the expensive online numerical integration of DMPs into position and velocity basis functions that can be used to represent trajectories or trajectory distributions similar to ProMPs while maintaining all the properties of dynamical systems. Since we inherit the properties of both methodologies, we call our proposed model Probabilistic Dynamic Movement Primitives (ProDMPs). Additionally, we embed ProDMPs in a deep neural network architecture and propose a new cost function for efficient end-to-end learning of higher-order trajectory statistics. To this end, we leverage Bayesian Aggregation for non-linear iterative conditioning on sensory inputs. Our proposed model achieves smooth trajectory generation, goal-attractor convergence, correlation analysis, non-linear conditioning, and online re-planning in one framework. Our code can be found at https://github.com/BruceGeLi/ProDMP RAL.
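The dynamical-systems side of this unification can be sketched in a few lines: the transformation system that ProDMPs solve in closed form is, at its core, a critically damped spring-damper ODE. The snippet below integrates that ODE numerically with Euler steps; the gains, time constant, and zero forcing term are illustrative assumptions, not the authors' implementation, which replaces exactly this kind of online integration with basis functions.

```python
def integrate_dmp(y0, g, forcing=lambda t: 0.0,
                  alpha=25.0, beta=6.25, tau=1.0, dt=1e-3, T=3.0):
    """Euler-integrate the DMP transformation system
    tau*dv = alpha*(beta*(g - y) - v) + f(t),  tau*dy = v."""
    y, v, t = y0, 0.0, 0.0
    traj = [y]
    while t < T:
        dv = (alpha * (beta * (g - y) - v) + forcing(t)) / tau
        dy = v / tau
        v += dv * dt
        y += dy * dt
        t += dt
        traj.append(y)
    return traj

# With a zero forcing term the system is a critically damped spring:
# it converges smoothly to the goal from any initial state.
trajectory = integrate_dmp(y0=0.0, g=1.0)
```

Precomputing the solution of this ODE as basis functions, as the paper does, removes the per-step integration cost while keeping the goal-attractor behavior.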
|
| |
| 08:48-08:54, Paper MoAT17.4 | Add to My Program |
| Imitation-Guided Multimodal Policy Generation from Behaviourally Diverse Demonstrations |
|
| Zhu, Shibei | Aalto University |
| Kaushik, Rituraj | Aalto University, Finland |
| Kaski, Samuel | Aalto University, University of Manchester |
| Kyrki, Ville | Aalto University |
Keywords: Evolutionary Robotics, Imitation Learning, Reinforcement Learning
Abstract: Learning policies from multiple demonstrators is often difficult because different individuals perform the same task differently due to hidden factors such as preferences. In the context of policy learning, this leads to multimodal policies. Existing policy learning methods often converge to a single solution mode, failing to capture the diversity in the solution space. In this paper, we introduce an imitation-guided reinforcement learning framework to solve the multimodal policy learning problem from a limited number of state-only demonstrations. We then propose LfBD (Learning from Behaviourally diverse Demonstration), an algorithm that builds a parameterised solution space to capture the variability in the behaviour space defined by the demonstrations. To this end, we construct a projection function based on the state density distributions of the demonstrations to define such a space. Our goal is not only to learn how to solve the task as the human demonstrator does but also to extrapolate beyond the provided demonstrations. In addition, we show that with our method, we can perform a post-hoc policy search in the built solution space to recover policies that satisfy specific constraints or to find a policy that matches a given (state-only) behaviour.
|
| |
| 08:54-09:00, Paper MoAT17.5 | Add to My Program |
| Model-Based Adversarial Imitation Learning from Demonstrations and Human Reward |
|
| Huang, Jie | Ocean University of China |
| Hao, Jiangshan | Ocean University of China |
| Juan, Rongshun | Tianjin University |
| Gomez, Randy | Honda Research Institute Japan Co., Ltd |
| Nakamura, Keisuke | Honda Research Institute Japan Co., Ltd |
| Li, Guangliang | Ocean University of China |
Keywords: Imitation Learning, Human-Robot Collaboration, Reinforcement Learning
Abstract: Reinforcement learning (RL) can potentially be applied to real-world robot control in complex and uncertain environments. However, it is difficult or even impractical to design an efficient reward function for various tasks, especially in large and high-dimensional environments. Generative adversarial imitation learning (GAIL), a general model-free imitation learning method, allows robots to directly learn policies from expert trajectories in large and high-dimensional environments. However, GAIL is still sample-inefficient in terms of environmental interaction. In this paper, to solve this problem, we propose model-based adversarial imitation learning from demonstrations and human reward (MAILDH), a novel model-based interactive imitation framework combining the advantages of GAIL, interactive RL, and model-based RL. We tested our method in eight physics-based discrete and continuous control tasks for RL. Our results show that MAILDH can greatly improve sample efficiency and robustness compared to the original GAIL.
|
| |
| 09:00-09:06, Paper MoAT17.6 | Add to My Program |
| Interpretable Motion Planner for Urban Driving Via Hierarchical Imitation Learning |
|
| Wang, Bikun | Horizon Robotics |
| Wang, Zhipeng | Horizon Robotics |
| Zhu, Chenhao | Horizon Robotics |
| Zhang, Zhiqiang | Horizon Robotics |
| Wang, Zhichen | Horizon Robotics |
| Lin, Penghong | Horizon Robotics |
| Liu, Jingchu | Horizon Robotics |
| Zhang, Qian | Horizon Robotics |
Keywords: Imitation Learning, Computer Vision for Automation, Task and Motion Planning
Abstract: Learning-based approaches have achieved remarkable performance in the domain of autonomous driving. Leveraging the impressive ability of neural networks and large amounts of human driving data, complex patterns and rules of driving behavior can be encoded as a model to benefit the autonomous driving system. In addition, an increasing number of data-driven works have studied the decision-making and motion planning modules. However, the reliability and stability of neural networks are still full of uncertainty. In this paper, we introduce a hierarchical planning architecture, including a high-level grid-based behavior planner and a low-level trajectory planner, which is highly interpretable and controllable. While the high-level planner is responsible for finding a consistent route, the low-level planner generates a feasible trajectory. We evaluate our method both in closed-loop simulation and in real-world driving, and demonstrate that the neural network planner has outstanding performance in complex urban autonomous driving scenarios.
|
| |
| 09:06-09:12, Paper MoAT17.7 | Add to My Program |
| Hierarchical Imitation Learning for Stochastic Environments |
|
| Igl, Maximilian | Waymo LLC |
| Shah, Punit | Waymo |
| Mougin, Paul | Waymo |
| Srinivasan, Sirish | ETH Zürich |
| Gupta, Tarun | University of Oxford |
| White, Brandyn | Waymo |
| Shiarlis, Kyriacos | Waymo |
| Whiteson, Shimon | Waymo |
Keywords: Imitation Learning, Representation Learning, Deep Learning Methods
Abstract: Many applications of imitation learning require the agent to generate the full distribution of observed behaviour in the training data. For example, to evaluate the safety of autonomous vehicles in simulation, accurate and diverse behaviour models of other road users are paramount. Existing methods that improve this distributional realism typically rely on hierarchical policies. These condition the policy on types such as goals or personas that give rise to the multi-modal behaviour. However, such methods are often inappropriate for stochastic environments where the agent must also react to external factors. Because agent types are inferred from the observed future trajectory during training, these environments require that the contributions of internal and external factors to the agent behaviour are disentangled and only internal factors that are under the agent's control are encoded in the type. Encoding future information about external factors leads to inappropriate agent reactions during testing, when the future is unknown and types must be drawn randomly.
|
| |
| 09:12-09:18, Paper MoAT17.8 | Add to My Program |
| Efficient Deep Learning of Robust, Adaptive Policies Using Tube MPC-Guided Data Augmentation |
|
| Zhao, Tong | Massachusetts Institute of Technology |
| Tagliabue, Andrea | Massachusetts Institute of Technology |
| How, Jonathan | Massachusetts Institute of Technology |
Keywords: Imitation Learning, Machine Learning for Robot Control, Robust/Adaptive Control
Abstract: The deployment of agile autonomous systems in challenging, unstructured environments requires adaptation capabilities and robustness to uncertainties. Existing robust and adaptive controllers, such as those based on model predictive control (MPC), can achieve impressive performance at the cost of heavy online onboard computations. Strategies that efficiently learn robust and onboard-deployable policies from MPC have emerged, but they still lack fundamental adaptation capabilities. In this work, we extend an existing efficient Imitation Learning (IL) algorithm for robust policy learning from MPC with the ability to learn policies that adapt to challenging model/environment uncertainties. The key idea of our approach consists in modifying the IL procedure by conditioning the policy on a learned lower-dimensional model/environment representation that can be efficiently estimated online. We tailor our approach to the task of learning an adaptive position and attitude control policy to track trajectories under challenging disturbances on a multirotor. Evaluations in simulation show that a high-quality adaptive policy can be obtained in about 1.3 hours. We additionally empirically demonstrate rapid adaptation to in- and out-of-training-distribution uncertainties, achieving a 6.1 cm average position error under wind disturbances that correspond to about 50% of the weight of the robot, and that are 36% larger than the maximum wind seen during training.
|
| |
| 09:18-09:24, Paper MoAT17.9 | Add to My Program |
| Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations |
|
| Hao, Yilun | Stanford University |
| Wang, Ruinan | Stanford University |
| Cao, Zhangjie | Stanford University |
| Wang, Zihan | Stanford University |
| Cui, Yuchen | Stanford University |
| Sadigh, Dorsa | Stanford University |
Keywords: Imitation Learning, Learning from Demonstration
Abstract: Multimodal demonstrations provide robots with an abundance of information to make sense of the world. However, such abundance may not always lead to good performance when it comes to learning sensorimotor control policies from human demonstrations. Extraneous data modalities can lead to state over-specification, where the state contains modalities that are not only useless for decision-making but also can change data distribution across environments. State over-specification leads to issues such as the learned policy not generalizing outside of the training data distribution. In this work, we propose Masked Imitation Learning (MIL) to address state over-specification by selectively using informative modalities. Specifically, we design a masked policy network with a binary mask to block certain modalities. We develop a bi-level optimization algorithm that learns this mask to accurately filter over-specified modalities. We demonstrate empirically that MIL outperforms baseline algorithms in simulated domains and effectively recovers the environment-invariant modalities on a multimodal dataset collected on a real robot. Videos and supplemental details are at: https://tinyurl.com/masked-il
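The core masking idea can be illustrated with a toy sketch: a binary mask zeroes out blocked modalities before the concatenated features reach the policy. The modality shapes, mask values, and linear policy head below are hypothetical stand-ins, not the authors' network or their bi-level mask-learning procedure.

```python
def masked_features(modalities, mask):
    """Concatenate per-modality feature vectors, zeroing out (blocking)
    every modality whose mask bit is 0."""
    out = []
    for feats, keep in zip(modalities, mask):
        out.extend([f if keep else 0.0 for f in feats])
    return out

def linear_policy(features, weights):
    """Toy single-output linear policy head."""
    return sum(f * w for f, w in zip(features, weights))

# Two modalities: an informative one and an over-specified, spurious one.
obs = [[0.5, -0.2], [9.9, 3.3]]
w = [1.0, 1.0, 1.0, 1.0]
action_all = linear_policy(masked_features(obs, [1, 1]), w)     # uses both
action_masked = linear_policy(masked_features(obs, [1, 0]), w)  # blocks 2nd
```

Because the mask only zeroes features, the policy's input dimensionality stays fixed, which is what lets the mask itself be treated as a learnable variable in the outer optimization.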
|
| |
| 09:24-09:30, Paper MoAT17.10 | Add to My Program |
| Does Unpredictability Influence Driving Behavior? |
|
| Samavi, Sepehr | University of Toronto |
| Shkurti, Florian | University of Toronto |
| Schoellig, Angela P. | TU Munich |
Keywords: Human-Aware Motion Planning, Imitation Learning
Abstract: In this paper we investigate the effect of the unpredictability of surrounding cars on an ego-car performing a driving maneuver. We use Maximum Entropy Inverse Reinforcement Learning to model reward functions for an ego-car conducting a lane change in a highway setting. We define a new feature based on the unpredictability of surrounding cars and use it in the reward function. We learn two reward functions from human data: a baseline and one that incorporates our defined unpredictability feature, then compare their performance with a quantitative and qualitative evaluation. Our evaluation demonstrates that incorporating the unpredictability feature leads to a better fit of human-generated test data. These results encourage further investigation of the effect of unpredictability on driving behavior.
|
| |
| 09:30-09:36, Paper MoAT17.11 | Add to My Program |
| From Temporal-Evolving to Spatial-Fixing: A Keypoints-Based Learning Paradigm for Visual Robotic Manipulation |
|
| Riou, Kevin | Nantes University |
| Dong, Kaiwen | China University of Mining and Technology, Xuzhou, 221116, China |
| Subrin, Kévin | Université De Nantes / LS2N |
| Sun, Yanjing | School of Information and Control Engineering, China University |
| Le Callet, Patrick | Nantes University |
Keywords: Imitation Learning, Representation Learning, Sensorimotor Learning
Abstract: Current learning pipelines for robotic manipulation infer movement primitives sequentially along the temporal-evolving axis, which can result in an accumulation of prediction errors and subsequently cause the visual observations to fall out of the training distribution. This paper proposes a novel hierarchical behavior cloning approach that dissociates the standard behaviour cloning (BC) pipeline into two stages. The intuition of this approach is to eliminate accumulation errors using a fixed spatial representation. In the first stage, a high-level planner translates the initial observation of the scene into task-specific spatial waypoints. Then, a low-level robotic path planner takes over the task of guiding the robot by executing a set of pre-defined elementary movements or actions known as primitives, with the goal of reaching the previously predicted waypoints. Our hierarchical keypoints-based paradigm aims to simplify the existing temporal-evolving approach: it directly spatializes the whole sequence of primitives as a set of 8D waypoints from the very first observation. Extensive experiments demonstrate that our paradigm achieves results comparable to Reinforcement Learning (RL) and outperforms existing offline BC approaches, with only a single-shot inference from the initial observation. Code and models are available at: https://github.com/KevinRiou22/spatial-fixing-il
|
| |
| 09:36-09:42, Paper MoAT17.12 | Add to My Program |
| Disturbance Injection under Partial Automation: Robust Imitation Learning for Long-Horizon Tasks |
|
| Tahara, Hirotaka | Nara Institute of Science and Technology |
| Sasaki, Hikaru | Nara Institute of Science and Technology |
| Oh, Hanbit | Nara Institute of Science and Technology |
| Anarossi, Edgar | Nara Institute of Science and Technology |
| Matsubara, Takamitsu | Nara Institute of Science and Technology |
Keywords: Imitation Learning, Learning from Demonstration
Abstract: Partial Automation (PA) with intelligent support systems has been introduced in industrial machinery and advanced automobiles to reduce the burden of long hours of human operation. Under PA, operators perform manual operations (providing actions) and operations that switch between automatic and manual modes (mode-switching). Since PA reduces the total duration of manual operation, these two operations, actions and mode-switching, can be replicated by imitation learning with high sample efficiency. To this end, this paper proposes Disturbance Injection under Partial Automation (DIPA) as a novel imitation learning framework. In DIPA, the mode and the actions (in manual mode) are assumed to be observable in each state and are used to learn both action and mode-switching policies. The learning is robustified by injecting disturbances into the operator's actions, optimizing the disturbance level to minimize the covariate shift under PA. We experimentally validated the effectiveness of our method on long-horizon tasks in two simulations and a real robot environment, and confirmed that our method outperformed previous methods and reduced the demonstration burden.
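The action-side disturbance injection can be sketched minimally as follows. The Gaussian noise model and fixed level below are illustrative assumptions; the paper additionally optimizes the level to minimize covariate shift and learns a mode-switching policy on top.

```python
import random

def inject_disturbance(actions, level, seed=0):
    """Perturb demonstrated actions with zero-mean Gaussian noise whose
    standard deviation is the injected disturbance level."""
    rng = random.Random(seed)
    return [a + rng.gauss(0.0, level) for a in actions]

# Demonstrated (manual-mode) actions from the operator.
demo = [0.1, 0.2, 0.3]
noisy = inject_disturbance(demo, level=0.05)   # robustified training data
clean = inject_disturbance(demo, level=0.0)    # zero level leaves actions intact
```

Forcing the operator to correct injected disturbances exposes recovery behaviour in the demonstrations, which is the mechanism that reduces covariate shift at test time.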
|
| |
| 09:42-09:48, Paper MoAT17.13 | Add to My Program |
| Training Robots without Robots: Deep Imitation Learning for Master-To-Robot Policy Transfer |
|
| Kim, Heecheol | The University of Tokyo |
| Ohmura, Yoshiyuki | The University of Tokyo |
| Nagakubo, Akihiko | National Institute of Advanced Industrial Science and Technology |
| Kuniyoshi, Yasuo | The University of Tokyo |
Keywords: Imitation Learning, Deep Learning in Grasping and Manipulation, Dual Arm Manipulation
Abstract: Deep imitation learning is promising for robot manipulation because it only requires demonstration samples. In this study, deep imitation learning is applied to tasks that require force feedback. However, existing demonstration methods have deficiencies: bilateral teleoperation requires a complex control scheme and is expensive, and kinesthetic teaching suffers from visual distractions caused by human intervention. This research proposes a new master-to-robot (M2R) policy transfer system that does not require a robot in order to teach force feedback-based manipulation tasks. The human directly demonstrates a task using a controller. This controller resembles the kinematic parameters of the robot arm and uses the same end-effector with force/torque (F/T) sensors to measure the force feedback. Using this controller, the operator can feel force feedback without a bilateral system. The proposed method can overcome domain gaps between the master and the robot using gaze-based imitation learning and a simple calibration method. Furthermore, a Transformer is applied to infer the policy from F/T sensory input. The proposed system was evaluated on a bottle-cap-opening task that requires force feedback.
|
| |
| 09:48-09:54, Paper MoAT17.14 | Add to My Program |
| Imitrob: Imitation Learning Dataset for Training and Evaluating 6D Object Pose Estimators |
|
| Sedlar, Jiri | Czech Technical University |
| Stepanova, Karla | Czech Technical University |
| Skoviera, Radoslav | Czech Institute of Informatics, Robotics, and Cybernetics; Czech |
| Behrens, Jan Kristof | Czech Technical University in Prague, CIIRC |
| Tuna, Matúš | Comenius University in Bratislava |
| Sejnova, Gabriela | Czech Technical University in Prague |
| Sivic, Josef | Czech Technical University |
| Babuska, Robert | Delft University of Technology |
Keywords: Imitation Learning, Object Detection, Segmentation and Categorization, Computer Vision for Manufacturing
Abstract: This paper introduces a dataset for training and evaluating methods for 6D pose estimation of hand-held tools in task demonstrations captured by a standard RGB camera. Despite the significant progress of 6D pose estimation methods, their performance is usually limited for heavily occluded objects, which is a common case in imitation learning, where the object is typically partially occluded by the manipulating hand. Currently, there is a lack of datasets that would enable the development of robust 6D pose estimation methods for these conditions. To overcome this problem, we collect a new dataset (Imitrob) aimed at 6D pose estimation in imitation learning and other applications where a human holds a tool and performs a task. The dataset contains image sequences of nine different tools and twelve manipulation tasks with two camera viewpoints, four human subjects, and left/right hand. Each image is accompanied by an accurate ground truth measurement of the 6D object pose obtained by the HTC Vive motion tracking device. The use of the dataset is demonstrated by training and evaluating a recent 6D object pose estimation method (DOPE) in various setups. The dataset and code are publicly available at http://imitrob.ciirc.cvut.cz/imitrobdataset.php.
|
| |
| MoAT18 Regular session, 331ABC |
Add to My Program |
| Calibration and Identification |
|
| |
| Chair: Leutenegger, Stefan | Technical University of Munich |
| Co-Chair: Lee, Dongjun | Seoul National University |
| |
| 08:30-08:36, Paper MoAT18.1 | Add to My Program |
| Accurate and Interactive Visual-Inertial Sensor Calibration with Next-Best-View and Next-Best-Trajectory Suggestion |
|
| Choi, Christopher | Imperial College London |
| Xu, Binbin | University of Toronto |
| Leutenegger, Stefan | Technical University of Munich |
Keywords: Calibration and Identification, Visual-Inertial SLAM, SLAM
Abstract: Visual-Inertial (VI) sensors are popular in robotics, self-driving vehicles, and augmented and virtual reality applications. In order to use them for any computer vision or state-estimation task, a good calibration is essential. However, collecting informative calibration data in order to render the calibration parameters observable is not trivial for a non-expert. In this work, we introduce a novel VI calibration pipeline that guides a non-expert with the use of a graphical user interface and information theory in collecting informative calibration data with Next-Best-View and Next-Best-Trajectory suggestions to calibrate the intrinsics, extrinsics, and temporal misalignment of a VI sensor. We show through experiments that our method is faster, more accurate, and more consistent than state-of-the-art alternatives. Specifically, we show how calibrations with our proposed method achieve higher accuracy estimation results when used by state-of-the-art VI Odometry as well as VI-SLAM approaches.
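A minimal sketch of the information-theoretic view selection behind such suggestions, under the simplifying assumption of a diagonal Gaussian belief over the calibration parameters; the candidate view names and numbers below are hypothetical, not the paper's formulation.

```python
import math

def entropy_proxy(variances):
    """Differential-entropy proxy of a diagonal Gaussian belief over the
    calibration parameters (sum of 0.5*log(variance), constants dropped)."""
    return sum(0.5 * math.log(v) for v in variances)

def next_best_view(current, candidates):
    """Return the candidate view with the largest expected information gain,
    i.e. the largest predicted entropy reduction of the parameter belief."""
    h0 = entropy_proxy(current)
    gains = {name: h0 - entropy_proxy(post) for name, post in candidates.items()}
    return max(gains, key=gains.get)

# Hypothetical belief over three parameters (e.g. focal length, extrinsic
# translation, time offset) and two candidate views with predicted posteriors.
belief = [1.0, 1.0, 4.0]
views = {
    "frontal": [0.9, 0.9, 3.9],  # barely informative
    "oblique": [0.5, 0.8, 1.0],  # strongly constrains the weakest parameter
}
best = next_best_view(belief, views)
```

Guiding a non-expert amounts to repeatedly suggesting the view (or trajectory) with the largest such gain until all parameters are well observed.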
|
| |
| 08:36-08:42, Paper MoAT18.2 | Add to My Program |
| A ROS-Based Kinematic Calibration Tool for Serial Robots |
|
| Pascal, Caroline | ENSTA Paris |
| Doaré, Olivier | UME ENSTA Paris |
| Chapoutot, Alexandre | ENSTA Paris |
Keywords: Calibration and Identification, Software-Hardware Integration for Robot Systems, Kinematics
Abstract: The use of serial robots for industrial and research purposes is often limited by flawed positioning accuracy, caused by the differences between the robot's nominal model and the real one. This issue can be solved by means of kinematic calibration, which is usually a tedious and intricate task. In this paper, we propose a complete kinematic calibration procedure relying on established geometric modeling, measurement design, and parameter identification methods, as well as multiple integration tools, to provide high adaptability and simplified handling. The overall process was bundled into a ROS-based modular and user-friendly package, whose main objective is to offer a smooth and fully integrated framework for the kinematic calibration of serial robots. Our solution was successfully tested using a motion tracking device, and allowed us to increase the overall positioning accuracy of two different serial robots by 75% in a matter of hours.
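The parameter-identification step of such a pipeline can be illustrated with a deliberately tiny example: a one-parameter linear least-squares fit of a link-length offset from measured end-effector positions. This is a sketch of the identification principle, not the package's actual routine, and the robot model below is a hypothetical single revolute joint.

```python
import math

def identify_length_offset(nominal_L, thetas, measured_x):
    """One-parameter kinematic calibration: model x = (L + dL) * cos(theta);
    linear least squares on the residuals recovers the length offset dL."""
    num = sum((mx - nominal_L * math.cos(t)) * math.cos(t)
              for t, mx in zip(thetas, measured_x))
    den = sum(math.cos(t) ** 2 for t in thetas)
    return num / den

# A robot whose real link is 20 mm longer than its nominal model.
true_L, nominal_L = 1.02, 1.00
angles = [0.1, 0.5, 1.0, 1.4]
meas = [true_L * math.cos(t) for t in angles]
dL = identify_length_offset(nominal_L, angles, meas)  # recovers ~0.02 m
```

Real calibration stacks many such residual equations over all geometric parameters and solves them jointly, but the least-squares structure is the same.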
|
| |
| 08:42-08:48, Paper MoAT18.3 | Add to My Program |
| FUSE-D: Framework for UAV System-Parameter Estimation with Disturbance Detection |
|
| Böhm, Christoph | University of Klagenfurt |
| Weiss, Stephan | Universität Klagenfurt |
Keywords: Calibration and Identification, Force and Tactile Sensing, Autonomous Vehicle Navigation
Abstract: Modern unmanned aerial vehicles (UAVs) with sophisticated mechanics ask for extended online system identification to aid model-based controls in task execution. In addition, UAVs in adverse environmental conditions require a more detailed environmental disturbance understanding. The necessary combination of online system identification, sensor suite self-calibration, and external disturbance analysis to tackle these issues holistically is currently an open problem. Our proposed FUSE-D approach combines these elements based on a system model at the rotor-speed level and a single global pose sensor (e.g., a tracking system like Optitrack). Besides sensor intrinsics and extrinsics, the framework allows estimating the UAV's rotor geometry, mass, moments of inertia, and the rotors' aerodynamic properties, as well as an external force and where it acts on the UAV. The general formulation allows us to extend the approach to an N-rotor (multi-rotor) UAV and classify the type of external disturbance. We perform a detailed non-linear observability analysis for the 43 + 7N states and do a statistically relevant embedded hardware-in-the-loop performance analysis in the realistic simulation environment Gazebo with RotorS.
|
| |
| 08:48-08:54, Paper MoAT18.4 | Add to My Program |
| Multiplanar Self-Calibration for Mobile Cobot 3D Object Manipulation Using 2D Detectors and Depth Estimation |
|
| Dang, Tuan | University of Texas at Arlington |
| Nguyen, Khang | University of Texas at Arlington |
| Huber, Manfred | University of Texas at Arlington |
Keywords: AI-Enabled Robotics, Human-Robot Collaboration, Software Architecture for Robotics and Automation
Abstract: Calibration is the first and foremost step in dealing with sensor displacement errors that can appear during extended operation and off-time periods to enable robot object manipulation with precision. In this paper, we present a novel multiplanar self-calibration between the camera system and the robot's end-effector for 3D object manipulation. Our approach first takes the robot end-effector as ground truth to calibrate the camera's position and orientation while the robot arm moves the object in multiple planes in 3D space, and a 2D state-of-the-art vision detector identifies the object's center in the image coordinates system. The transformation between world coordinates and image coordinates is then computed using 2D pixels from the detector and 3D known points obtained by robot kinematics. Next, an integrated stereo-vision system estimates the distance between the camera and the object, resulting in 3D object localization. We test our proposed method on the Baxter robot with two 7-DOF arms and a 2D detector that can run in real time on an onboard GPU. After self-calibrating, our robot can localize objects in 3D using an RGB camera and depth image. The source code is available at https://github.com/tuantdang/calib_cobot.
|
| |
| 08:54-09:00, Paper MoAT18.5 | Add to My Program |
| Labelling Lightweight Robot Energy Consumption: A Mechatronics-Based Benchmarking Metric Set |
|
| Heredia, Juan | University of Southern Denmark |
| Kirschner, Robin Jeanne | TU Munich, Institute for Robotics and Systems Intelligence |
| Abdolshah, Saeed | Technical University of Munich |
| Schlette, Christian | University of Southern Denmark (SDU) |
| Haddadin, Sami | Technical University of Munich |
| Mikkel, Kjærgaard | University of Southern Denmark |
Keywords: Performance Evaluation and Benchmarking, Energy and Environment-Aware Automation, Actuation and Joint Mechanisms
Abstract: Compliance with global guidelines for sustainable and responsible production in modern industry requires a comparative analysis of consumer devices' energy consumption (EC). This also holds true for the newly established generation of lightweight industrial robots (LIRs). To identify potential strategies for energy optimization, standardized benchmarking procedures are required. However, to the best of the authors' knowledge, there is currently no standardized method for benchmarking the EC of manipulators. In response to this need, we have developed a comprehensive benchmarking framework to evaluate the EC of various LIR designs, delving into the theoretical power consumption under both static and dynamic conditions. Our analysis has led to the proposal of seven metrics: three static and four dynamic. The static metrics (controller consumption, joint electronics consumption, and mechanical brakes' consumption) evaluate the maintenance EC of the robot. Meanwhile, three dynamic metrics gauge the system's energy efficiency during motion, with or without payload, and we complete the selection by introducing the cost-of-transportation map for manipulators. For each of the metrics, we suggest a standardized measurement procedure based on state-of-the-art norms and literature. The metric set and experimental procedures are demonstrated using five manipulators (UR3e, UR5e, FR3, M0609, Gen3). Among the results, we can see interesting trends for future optimization of the electronic components and their architecture, e.g., reducing the robot's EC by decentralizing computation via low-consumption onboard controllers for basic tasks and external servers for complex ones.
|
| |
| 09:00-09:06, Paper MoAT18.6 | Add to My Program |
| The Role of Absolute Positioning Error in Hand-Eye Calibration and Robotic Guidance Systems: An Analysis |
|
| Chalus, Michal | University of West Bohemia |
| Vanicek, Ondrej | University of West Bohemia |
| Liska, Jindrich | University of West Bohemia |
Keywords: Calibration and Identification, Computer Vision for Manufacturing, Industrial Robots
Abstract: Robotic manipulators face serious issues due to their absolute positioning error. This error is usually compensated for by an operator in classical robot programming using the teach-and-play method. However, it has a significant effect on the accuracy of robotic guidance systems (RGS) that automatically generate the process tool trajectory based on measured data from a sensor. In this paper, we first describe the various components of an RGS that affect its overall accuracy. We then introduce a proposed model for the calibration process (MCP) that can be used to analyze the effect of absolute positioning errors on the accuracy of hand-eye calibration, six-point calibration of a process tool, and the mutual transformation between these tools. Simulations were used to evaluate the proposed MCP model. The results of this analysis are crucial for the practical use of RGS.
|
| |
| 09:06-09:12, Paper MoAT18.7 | Add to My Program |
| Robotic Kinematic Calibration with Only Position Data and Consideration of Non-Geometric Errors Using POE-Based Model and Gaussian Mixture Models |
|
| Luo, Xiao | The Chinese University of Hong Kong |
| Xian, Yitian | The Chinese University of Hong Kong |
| Lei, Man Cheong | The Chinese University of Hong Kong |
| Li, Jian | The Chinese University of Hong Kong |
| Xie, Ke | The Chinese University of Hong Kong |
| Zou, Limin | The Chinese University of Hong Kong |
| Li, Zheng | The Chinese University of Hong Kong |
Keywords: Calibration and Identification, Kinematics, Probability and Statistical Methods
Abstract: Kinematic calibration is crucial to improve the positioning accuracy of serial robots. This paper proposes a novel algorithm for robotic kinematic calibration based on an augmented product of exponentials (POE)-based kinematic model using Gaussian mixture models (GMMs) with only position data. In this algorithm, non-geometric errors that cannot be fitted by varying the parameters within the traditional robot model are also considered and compensated. The approach involves a three-stage calibration process that identifies the kinematic model parameters and trains the GMMs. Finally, the algorithm is applied to two serial robots for simulation and experimental validation. The effectiveness of the proposed algorithm is verified by both sets of results, and a significant improvement in error reduction, from 26% to 96%, can be observed in comparison with other existing approaches.
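The GMM side of such a pipeline can be sketched with a minimal 1-D expectation-maximization fit to position residuals. The two-component setup and synthetic residuals below are illustrative assumptions, not the paper's model, which operates on the residuals left after POE-based geometric identification.

```python
import math

def em_gmm_1d(data, means, n_iter=50):
    """Minimal two-component 1-D EM fit of a Gaussian mixture; returns
    (weights, means, variances). `means` supplies the initialization."""
    k = len(means)
    w = [1.0 / k] * k
    var = [1.0] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w[j] * math.exp(-(x - means[j]) ** 2 / (2.0 * var[j]))
                 / math.sqrt(2.0 * math.pi * var[j]) for j in range(k)]
            s = sum(p)
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            w[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var[j] = max(sum(r[j] * (x - means[j]) ** 2
                             for r, x in zip(resp, data)) / nj, 1e-6)
    return w, means, var

# Synthetic position residuals with two well-separated error modes.
residuals = [0.0, 0.1, -0.1, 0.05, 5.0, 5.1, 4.9, 5.05]
weights, mus, vars_ = em_gmm_1d(residuals, means=[1.0, 4.0])
```

Once fitted, the mixture serves as a configuration-dependent correction model for the non-geometric part of the error that the kinematic parameters cannot absorb.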
|
| |
| 09:12-09:18, Paper MoAT18.8 | Add to My Program |
| MOISST: Multimodal Optimization of Implicit Scene for SpatioTemporal Calibration |
|
| Herau, Quentin | Huawei, University of Burgundy |
| Piasco, Nathan | Huawei Technologies France |
| Bennehar, Moussab | LIRMM - UMR 5506 |
| Roldao, Luis | Huawei |
| Tsishkou, Dzmitry | Huawei Technologies |
| Migniot, Cyrille | U Bourgogne |
| Vasseur, Pascal | Université de Picardie Jules Verne |
| Demonceaux, Cédric | Université de Bourgogne |
Keywords: Sensor Fusion, Calibration and Identification, Computer Vision for Transportation
Abstract: With the recent advances in autonomous driving and the decreasing cost of LiDARs, the use of multimodal sensor systems is on the rise. However, in order to make use of the information provided by a variety of complementary sensors, it is necessary to accurately calibrate them. We take advantage of recent advances in computer graphics and implicit volumetric scene representation to tackle the problem of multi-sensor spatial and temporal calibration. Thanks to a new formulation of the Neural Radiance Field (NeRF) optimization, we are able to jointly optimize calibration parameters along with scene representation based on radiometric and geometric measurements. Our method enables accurate and robust calibration from data captured in uncontrolled and unstructured urban environments, making our solution more scalable than existing calibration solutions. We demonstrate the accuracy and robustness of our method in urban scenes typically encountered in autonomous driving scenarios.
|
| |
| 09:18-09:24, Paper MoAT18.9 | Add to My Program |
| Automatic Spatial Radar Camera Calibration Via Geometric Constraints with Doppler-Optical Flow Fusion |
|
| Ge, Jintian | Nanyang Technological University |
| Yanxin, Zhou | Nanyang Technological University |
| Lou, Baichuan | Nanyang Technological University |
| Lv, Chen | Nanyang Technological University |
Keywords: Calibration and Identification, Sensor Fusion, Computer Vision for Automation
Abstract: Many intelligent robots use a combination of radar and camera sensors to capture environmental information. Robust and accurate perception relies heavily on the result of multi-sensor calibration. Most current spatial calibration methods require a calibration board or a special marker as the target. In this paper, we provide a novel calibration method for an RGBD camera and a millimeter-wave radar that automatically estimates the extrinsic parameters. Our proposed method includes two stages: rough extrinsic parameters are estimated using object contours as geometric constraints, and meanwhile, the optimum is reached by optimizing based on the difference between the velocities obtained from the camera and the radar. It only needs an object moving past the sensors and does not require a calibration board. We validate our method through simulation and real-world experiments. We construct a simulation environment in CARLA to verify the performance of our proposed method at different angles. Furthermore, different levels of zero-mean Gaussian noise are added to evaluate the stability of our method. In addition, real-world experiments with different hardware setups are conducted to verify the feasibility of our method in real-world conditions.
|
| |
| 09:24-09:30, Paper MoAT18.10 | Add to My Program |
| Extrinsic Calibration of Camera to LIDAR Using a Differentiable Checkerboard Model |
|
| Fu, Lanke Frank Tarimo | University of Oxford |
| Chebrolu, Nived | University of Oxford |
| Fallon, Maurice | University of Oxford |
Keywords: Calibration and Identification
Abstract: Multi-modal sensing often involves determining correspondences between each domain's signals, which in turn depends on the accurate extrinsic calibration of the sensors. Challengingly, the camera-LIDAR sensor modalities are quite dissimilar, and the narrow field of view of most commercial LIDARs means that they observe only a partial view of the camera frustum. We present a framework for extrinsic calibration of a camera and a LIDAR using only a simple off-the-shelf checkerboard. It is designed to operate even when the LIDAR observes a significantly truncated portion of the checkerboard. Current state-of-the-art methods often require bespoke manufactured markers or full observation of the entire checkerboard in both camera and LIDAR data, which is prohibitive. By contrast, our novel algorithm directly aligns the LIDAR intensity pattern to the camera-detected checkerboard pattern using our differentiable formulation. The key step for achieving accurate extrinsics estimation is the use of the spatial derivatives provided by the differentiable checkerboard pattern, and jointly optimizing over all views. In our experiments, we achieve calibration accuracy on the order of 2-4 mm and demonstrate a 30% error reduction compared to state-of-the-art approaches. We are able to achieve this improvement while using only partial LIDAR views of the checkerboard, which allows for a simpler data capture process. We also demonstrate the generalizability of our approach to different combinations of LIDARs and cameras with varying sparsity patterns and noise levels.
|
| |
| 09:30-09:36, Paper MoAT18.11 | Add to My Program |
| Graph-Based Visual-Kinematic Fusion and Monte Carlo Initialization for Fast-Deployable Cable-Driven Robots |
|
| Khorrambakht, Rooholla | New York University |
| Damirchi, Hamed | University of Adelaide |
| Dindarloo, Mohammad Reza | K. N. Toosi University of Technology |
| Saki, Aria | K. N. Toosi University of Technology |
| Khalilpour, S. Ahmad | K. N. Toosi University of Technology |
| Taghirad, Hamid | K. N. Toosi University of Technology |
| Weiss, Stephan | Universität Klagenfurt |
Keywords: Parallel Robots, Calibration and Identification, Sensor Fusion
Abstract: Ease of calibration and high-accuracy task-space state estimation based purely on onboard sensors are key requirements for enabling easily deployable cable robots in real-world applications. In this work, we incorporate the onboard camera and kinematic sensors into a statistical fusion framework that presents a unified localization and calibration system requiring no initial values for the kinematic parameters. This is achieved by formulating a Monte-Carlo algorithm that initializes a factor-graph representation of the calibration and localization problem. With this, we are able to jointly identify both the kinematic parameters and the visual odometry scale alongside their corresponding uncertainties. We demonstrate the practical applicability of the framework using our state-estimation dataset, recorded with the ARAS-CAM suspended cable-driven parallel robot and published as part of this manuscript.
|
| |
| 09:36-09:42, Paper MoAT18.12 | Add to My Program |
| P2O-Calib: Camera-LiDAR Calibration Using Point-Pair Spatial Occlusion Relationship |
|
| Wang, Su | Robert Bosch |
| Zhang, Shini | Nanyang Technological University, Singapore |
| Qiu, Xuchong | Bosch |
Keywords: Calibration and Identification, Sensor Fusion, Deep Learning Methods
Abstract: An accurate and robust sensor calibration result is considered an important building block for follow-up research in the autonomous driving and robotics domains. Current works on extrinsic calibration between 3D LiDARs and monocular cameras mainly focus on target-based and target-less methods. Target-based methods are often used offline because of restrictions such as additional target design and target placement limits. Current target-less methods suffer from feature indeterminacy and feature mismatching in various environments. To alleviate these limitations, we propose a novel target-less calibration approach based on 2D-3D edge point extraction using the occlusion relationship in 3D space. Based on the extracted 2D-3D point pairs, we further propose an occlusion-guided point-matching method that improves the calibration accuracy and reduces computation costs. To validate the effectiveness of our approach, we evaluate the method's performance qualitatively and quantitatively on real images from the KITTI dataset. The results demonstrate that our method outperforms existing target-less methods and achieves low error and high robustness, which can contribute to practical applications relying on high-quality Camera-LiDAR calibration.
|
| |
| 09:42-09:48, Paper MoAT18.13 | Add to My Program |
| Wrench Estimation of Modular Manipulator with External Actuation and Joint Locking |
|
| Kim, Yonghyeok | Seoul National University |
| Lee, Hasun | Seoul National University |
| Lee, Jeongseob | Seoul National University |
| Lee, Dongjun | Seoul National University |
Keywords: Aerial Systems: Mechanics and Control, Distributed Robot Systems, Force Control
Abstract: This paper proposes an external wrench estimation method for a modular manipulator, where each link module is driven by external actuation (e.g., rotors, thrusters) and inter-module joints can be locked to increase the end-effector stiffness or workforce of the manipulator. For such systems, the commonly-used momentum-based observer (MBO) is not suitable due to the presence of unknown joint locking (JL) torque and also the degeneracy of the Jacobian transpose relation, with the system degree-of-freedom (DOF) becoming less than six under joint locking. To overcome this, we propose two novel external wrench estimation algorithms: a distributed algorithm based on recursive Newton-Euler dynamics and a centralized algorithm based on D'Alembert's principle, both using an F/T (force/torque) sensor at the base. Experiments are conducted to demonstrate the effectiveness of the proposed algorithms.
|
| |
| 09:48-09:54, Paper MoAT18.14 | Add to My Program |
| Observability-Aware Online Multi-Lidar Extrinsic Calibration |
|
| Das, Sandipan | KTH |
| af Klinteberg, Ludvig | Scania |
| Fallon, Maurice | University of Oxford |
| Chatterjee, Saikat | KTH Royal Institute of Technology |
Keywords: Calibration and Identification, Intelligent Transportation Systems, Localization
Abstract: Accurate and robust extrinsic calibration is necessary for deploying autonomous systems that need multiple sensors for perception. In this paper, we present a robust system for real-time extrinsic calibration of multiple lidars in the vehicle base frame without the need for any fiducial markers or features. We base our approach on matching absolute GNSS and estimated lidar poses in real-time. Comparing rotation components allows us to improve the robustness of the solution over the traditional least-squares approach, which compares translation components only. Additionally, instead of comparing all corresponding poses, we select the poses comprising maximum mutual information based on our novel observability criteria. This allows us to identify a subset of the poses helpful for real-time calibration. We also provide stopping criteria for ensuring calibration completion. To validate our approach, extensive tests were carried out on data collected using Scania test vehicles (7 sequences for a total of ~6.5 km). The results presented in this paper show that our approach is able to accurately determine the extrinsic calibration for various combinations of sensor setups.
|
| |
| MoAT19 Regular session, 360 Ambassador Ballroom |
Add to My Program |
| Deep Learning Methods I |
|
| |
| Chair: Arnold, Solvi | Shinshu University |
| Co-Chair: Ben Amor, Heni | Arizona State University |
| |
| 08:30-08:36, Paper MoAT19.1 | Add to My Program |
| Recognising Affordances in Predicted Futures to Plan with Consideration of Non-Canonical Affordance Effects |
|
| Arnold, Solvi | Shinshu University |
| Kuroishi, Mami | EPSON AVASYS |
| Karashima, Rin | EPSON AVASYS |
| Adachi, Tadashi | EPSON AVASYS |
| Yamazaki, Kimitoshi | Shinshu University |
Keywords: Deep Learning Methods, Task and Motion Planning, Neurorobotics
Abstract: We propose a novel system for action sequence planning based on a combination of affordance recognition and a neural forward model predicting the effects of affordance execution. By performing affordance recognition on predicted futures, we avoid reliance on explicit affordance effect definitions for multi-step planning. Because the system learns affordance effects from experience data, the system can foresee not just the canonical effects of an affordance, but also situation-specific side-effects. This allows the system to avoid planning failures due to such non-canonical effects, and makes it possible to exploit non-canonical effects for realising a given goal. We evaluate the system in simulation, on a set of test tasks that require consideration of canonical and non-canonical affordance effects.
|
| |
| 08:36-08:42, Paper MoAT19.2 | Add to My Program |
| AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness |
|
| Yang, Yizhuo | Nanyang Technological University |
| Yuan, Shenghai | Nanyang Technological University |
| Cao, Muqing | Nanyang Technological University |
| Yang, Jianfei | Nanyang Technological University |
| Xie, Lihua | Nanyang Technological University |
Keywords: Deep Learning Methods, Sensor Fusion, Human Detection and Tracking
Abstract: In this study, we introduce AV-PedAware, a self-supervised audio-visual fusion system designed to improve dynamic pedestrian awareness for robotics applications. Pedestrian awareness is a critical requirement in many robotics applications. However, traditional approaches that rely on cameras and LIDARs to cover multiple views can be expensive and susceptible to issues such as changes in illumination, occlusion, and weather conditions. Our proposed solution replicates human perception for 3D pedestrian detection using low-cost audio and visual fusion. This study represents the first attempt to employ audio-visual fusion to monitor footstep sounds for the purpose of predicting the movements of pedestrians in the vicinity. The system is trained through self-supervised learning based on LIDAR-generated labels, making it a cost-effective alternative to LIDAR-based pedestrian awareness. AV-PedAware achieves comparable results to LIDAR-based systems at a fraction of the cost. By utilizing an attention mechanism, it can handle dynamic lighting and occlusions, overcoming the limitations of traditional LIDAR and camera-based systems. To evaluate our approach's effectiveness, we collected a new multimodal pedestrian detection dataset and conducted experiments that demonstrate the system's ability to provide reliable 3D detection results using only audio and visual data, even in extreme visual conditions. We will make our collected dataset and source code available online for the community to encourage further development in the field of robotics perception systems.
|
| |
| 08:42-08:48, Paper MoAT19.3 | Add to My Program |
| A Multitask and Kernel Approach for Learning to Push Objects with a Target-Parameterized Deep Q-Network |
|
| Ewerton, Marco | Idiap Research Institute |
| Villamizar, Michael | IDIAP |
| Jankowski, Julius | Idiap Research Institute and EPFL |
| Calinon, Sylvain | Idiap Research Institute |
| Odobez, Jean-Marc | IDIAP |
Keywords: Deep Learning Methods, Deep Learning for Visual Perception, Perception for Grasping and Manipulation
Abstract: Pushing is an essential motor skill involved in several manipulation tasks, and has been an important research topic in robotics. Recent works have shown that Deep Q-Networks (DQNs) can learn pushing policies (when, where to push, and how) to solve manipulation tasks, potentially in synergy with other skills (e.g. grasping). Nevertheless, DQNs often assume a fixed setting and task, which may limit their deployment in practice. Furthermore, they suffer from sparse-gradient backpropagation when the action space is very large, a problem exacerbated by the fact that they are trained to predict state-action values based on a single reward function aggregating several facets of the task, rendering the model training challenging. To address these issues, we propose a multi-head target-parameterized DQN to learn robotic manipulation tasks, in particular pushing policies, and make the following contributions: i) we show that learning to predict different reward and task aspects can be beneficial compared to predicting a single value function where reward factors are not disentangled; ii) we study several alternatives to generalize a policy by encoding the target parameters either into the network layers or visually in the input; iii) we propose a kernelized version of the loss function, allowing us to obtain better, faster and more stable training performance. Extensive experiments on simulations validate our design choices, and we show that our architecture learned on simulated data can achieve high performance in a real-robot setup involving a Franka Emika robot arm and unseen objects.
|
| |
| 08:48-08:54, Paper MoAT19.4 | Add to My Program |
| DRKF: Distilled Rotated Kernel Fusion for Efficient Rotation Invariant Descriptors in Local Feature Matching |
|
| Huang, Ranran | Meituan |
| Cai, Jiancheng | Meituan |
| Li, Chao | Beijing University of Posts and Telecommunications |
| Wu, Zhuoyuan | Meituan |
| Liu, Xinmin | Meituan |
| Chai, Zhenhua | Meituan |
Keywords: Deep Learning Methods, Visual Learning, Deep Learning for Visual Perception
Abstract: The performance of local feature descriptors degrades in the presence of large rotation variations. To address this issue, we present an efficient approach to learning rotation invariant descriptors. Specifically, we propose Rotated Kernel Fusion (RKF) which imposes rotations on the convolution kernel to improve the inherent nature of CNN. Since RKF can be processed by the subsequent re-parameterization, no extra computational costs will be introduced in the inference stage. Moreover, we present Multi-oriented Feature Aggregation (MOFA) which aggregates features extracted from multiple rotated versions of the input image and can provide auxiliary knowledge for the training of RKF by leveraging the distillation strategy. We refer to the distilled RKF model as DRKF. Besides the evaluation on a rotation-augmented version of the public dataset HPatches, we also contribute a new dataset named DiverseBEV which is collected during the drone's flight and consists of bird's eye view images with large viewpoint changes and camera rotations. Extensive experiments show that our method can outperform other state-of-the-art techniques when exposed to large rotation variations.
|
| |
| 08:54-09:00, Paper MoAT19.5 | Add to My Program |
| Efficient Q-Learning Over Visit Frequency Maps for Multi-Agent Exploration of Unknown Environments |
|
| Chen, Xuyang | Cognitive Robot Autonomy and Learning Lab |
| Iyer, Ashvin | Purdue University |
| Wang, Zixing | Purdue University |
| Qureshi, Ahmed H. | Purdue University |
Keywords: Deep Learning Methods, Reinforcement Learning, Multi-Robot Systems
Abstract: The robot exploration task has been widely studied, with applications spanning from novel environment mapping to item delivery. For some time-critical tasks, such as rescue operations after catastrophes, the agent is required to explore as efficiently as possible. Recently, the Visit Frequency-based map representation achieved great success in such scenarios by discouraging repetitive visits with a frequency-based penalty. However, its relatively large size and single-agent settings hinder its further development. In this context, we propose the Integrated Visit Frequency Map, which encodes the same information as the Visit Frequency Map (VFM) in a more compact size, and a visit frequency-based multi-agent information exchange and control scheme that can accommodate both representations. Through tests in diverse settings, the results indicate that our proposed methods achieve a level of performance comparable to VFM with lower bandwidth requirements and generalize well to different multi-agent setups, including real-world environments.
|
| |
| 09:00-09:06, Paper MoAT19.6 | Add to My Program |
| Real-Time Trajectory-Based Social Group Detection |
|
| Jahangard, Simindokht | Monash University |
| Hayat, Munawar | Monash University |
| Rezatofighi, Hamid | Monash University |
Keywords: Deep Learning Methods, Human and Humanoid Motion Analysis and Synthesis, Human Detection and Tracking
Abstract: Social group detection is a crucial aspect of various robotic applications, including robot navigation and human-robot interactions. To date, a range of model-based techniques have been employed to address this challenge, such as the F-formation and trajectory similarity frameworks. However, these approaches often fail to provide reliable results in crowded and dynamic scenarios. Recent advancements in this area have mainly focused on learning-based methods, such as deep neural networks that use visual content or human pose. Although visual content based methods have demonstrated promising performance on large-scale datasets, their computational complexity poses a significant barrier to their practical use in real-time applications. To address these issues, we propose a simple and efficient framework for social group detection. Our approach explores the impact of motion trajectory on social grouping and utilizes a novel, reliable, and fast data-driven method. We formulate the individuals in a scene as a graph, where the nodes are represented by LSTM-encoded trajectories and the edges are defined by the distances between each pair of tracks. Our framework employs a modified graph transformer module and graph clustering losses to detect social groups. Our experiments on the popular JRDB-Act dataset reveal noticeable improvements in performance, with relative improvements ranging from 2% to 11%. Furthermore, our framework is significantly faster, with up to 12x faster inference times compared to state-of-the-art methods under the same computation resources. These results demonstrate that our proposed method is suitable for real-time robotic applications.
|
| |
| 09:06-09:12, Paper MoAT19.7 | Add to My Program |
| Point2Point: A Framework for Efficient Deep Learning on Hilbert Sorted Point Clouds with Applications in Spatio-Temporal Occupancy Prediction |
|
| Pandhare, Athrva Atul | University of Pennsylvania |
Keywords: Deep Learning Methods, Deep Learning for Visual Perception, Mapping
Abstract: The irregularity and permutation invariance of point cloud data pose challenges for effective learning. Conventional methods for addressing this issue involve converting raw point clouds to intermediate representations such as 3D voxel grids or range images. While such intermediate representations solve the problem of permutation invariance, they can result in significant loss of information. Approaches that do learn on raw point clouds either have trouble in resolving neighborhood relationships between points or are too complicated in their formulation. In this paper, we propose a novel approach to representing point clouds as a locality preserving 1D ordering induced by the Hilbert space-filling curve. We also introduce Point2Point, a neural architecture that can effectively learn on Hilbert-sorted point clouds. We show that Point2Point shows competitive performance on point cloud segmentation and generation tasks. Finally, we show the performance of Point2Point on Spatio-temporal Occupancy prediction from Point clouds.
|
| |
| 09:12-09:18, Paper MoAT19.8 | Add to My Program |
| Motion Planning Diffusion: Learning and Planning of Robot Motions with Diffusion Models |
|
| Mueller Carvalho, Joao Andre | Technische Universität Darmstadt |
| Le, An Thai | Technische Universität Darmstadt |
| Baierl, Mark | Technical University of Darmstadt |
| Koert, Dorothea | Technische Universitaet Darmstadt |
| Peters, Jan | Technische Universität Darmstadt |
Keywords: Deep Learning Methods, Learning from Experience
Abstract: Learning priors on trajectory distributions can help accelerate robot motion planning optimization. Given previously successful plans, learning trajectory generative models as priors for a new planning problem is highly desirable. Prior works propose several ways of utilizing this prior to bootstrap the motion planning problem, either by sampling the prior for initializations or by using the prior distribution in a maximum-a-posteriori formulation for trajectory optimization. In this work, we propose learning diffusion models as priors. We can then sample directly from the posterior trajectory distribution conditioned on task goals by leveraging the inverse denoising process of diffusion models. Furthermore, diffusion has recently been shown to effectively encode data multimodality in high-dimensional settings, which is particularly well-suited for large trajectory datasets. To demonstrate our method's efficacy, we compare our proposed method, Motion Planning Diffusion, against several baselines in simulated planar robot and 7-DoF robot arm manipulator environments. To assess the generalization capabilities of our method, we test it in environments with previously unseen obstacles. Our experiments show that diffusion models are strong priors for encoding high-dimensional trajectory distributions of robot motions.
|
| |
| 09:18-09:24, Paper MoAT19.9 | Add to My Program |
| Active Task Randomization: Learning Robust Skills Via Unsupervised Generation of Diverse and Feasible Tasks |
|
| Fang, Kuan | University of California, Berkeley |
| Migimatsu, Toki | Stanford University |
| Mandlekar, Ajay Uday | NVIDIA |
| Fei-Fei, Li | Stanford University |
| Bohg, Jeannette | Stanford University |
Keywords: Deep Learning Methods, Deep Learning in Grasping and Manipulation, Representation Learning
Abstract: Solving real-world manipulation tasks requires robots to be equipped with a repertoire of skills that can be applied to diverse scenarios. While learning-based methods can enable robots to acquire skills from interaction data, their success relies on collecting training data that covers the diverse range of tasks that the robot may encounter during the test time. However, creating diverse and feasible training tasks often requires extensive domain knowledge and non-trivial manual labor. We introduce Active Task Randomization (ATR), an approach that learns robust skills through the unsupervised generation of training tasks. ATR selects suitable training tasks, which consist of an environment configuration and manipulation goal, by actively balancing their diversity and feasibility. In doing so, ATR effectively creates a curriculum that gradually increases task diversity while maintaining a moderate level of feasibility, which leads to more complex tasks as the skills become more capable. ATR predicts task diversity and feasibility with a compact task representation that is learned concurrently with the skills. The selected tasks are then procedurally generated in simulation with a graph-based parameterization. We demonstrate that the learned skills can be composed by a task planner to solve unseen sequential manipulation problems based on visual inputs. Compared to baseline methods, ATR can achieve superior success rates in single-step and sequential manipulation tasks.
|
| |
| 09:24-09:30, Paper MoAT19.10 | Add to My Program |
| Robust Self-Supervised Extrinsic Self-Calibration |
|
| Kanai, Takayuki | Toyota Research Institute |
| Vasiljevic, Igor | Toyota Research Institute |
| Guizilini, Vitor | Toyota Research Institute |
| Gaidon, Adrien | Toyota Research Institute |
| Ambrus, Rares | Toyota Research Institute |
Keywords: Deep Learning Methods, Calibration and Identification
Abstract: Autonomous vehicles and robots need to operate over a wide variety of scenarios in order to complete tasks efficiently and safely. Multi-camera self-supervised monocular depth estimation from videos is a promising way to reason about the environment, as it generates metrically scaled geometric predictions from visual data without requiring additional sensors. However, most works assume well-calibrated extrinsics to fully leverage this multi-camera setup, even though accurate and efficient calibration is still a challenging problem. In this work, we introduce a novel method for extrinsic calibration that builds upon the principles of self-supervised monocular depth and ego-motion learning. Our proposed curriculum learning strategy uses monocular depth and pose estimators with velocity supervision to estimate extrinsics, and then jointly learns extrinsic calibration along with depth and pose for a set of overlapping cameras rigidly attached to a moving vehicle. Experiments on a benchmark multi-camera dataset (DDAD) demonstrate that our method enables self-calibration in various scenes robustly and efficiently compared to a traditional vision-based pose estimation pipeline. Furthermore, we demonstrate the benefits of extrinsics self-calibration as a way to improve depth prediction via joint optimization. Project page: https://sites.google.com/tri.global/tri-sesc
|
| |
| 09:30-09:36, Paper MoAT19.11 | Add to My Program |
| Do More with Less: Single-Model, Multi-Goal Architectures for Resource-Constrained Robots |
|
| Wang, Zili | Boston University |
| Threatt, Drew | Boston University |
| Andersson, Sean | Boston University |
| Tron, Roberto | Boston University |
Keywords: Deep Learning Methods, Autonomous Agents
Abstract: Deep learning methods are widely used in robotic applications. By learning from prior experience, the robot can abstract knowledge of the environment, and use this knowledge to accomplish different goals, such as object search, frontier exploration, or scene understanding, with a smaller amount of resources than might be needed without that knowledge. Most existing methods typically require a significant amount of sensing, which in turn has significant costs in terms of power consumption for acquisition and processing, and typically focus on models that are tuned for each specific goal, leading to the need to train, store and run each one separately. These issues are particularly important in a resource-constrained setting, such as with small-scale robots or during long-duration missions. We propose a single, multi-task deep learning architecture that takes advantage of the structure of the partial environment to predict different abstractions of the environment (thus reducing the need for rich sensing), and to leverage these predictions to simultaneously achieve different high-level goals (thus sharing computation between goals). As an example application of the proposed architecture, we consider the specific example of a robot equipped with a 2-D laser scanner and an object detector, tasked with searching for an object (such as an exit) in a residential building while constructing a topological map that can be used for future missions. The prior knowledge of the environment is encoded using a U-Net deep network architecture. In this context, our work leads to an object search algorithm that is complete, and that outperforms a more traditional frontier-based approach. The topological map we produce uses scene trees to qualitatively represent the environment as a graph at a fraction of the cost of existing SLAM-based solutions. Our results demonstrate that it is possible to extract multi-task semantic information that is useful for navigation and mapping directly from bare-bone, non-semantic measurements.
|
| |
| 09:36-09:42, Paper MoAT19.12 | Add to My Program |
| Enhancing State Estimation in Robots: A Data-Driven Approach with Differentiable Ensemble Kalman Filters |
|
| Liu, Xiao | Arizona State University |
| Clark, Geoffrey | ASU |
| Campbell, Joseph | Carnegie Mellon University |
| Zhou, Yifan | Arizona State University |
| Ben Amor, Heni | Arizona State University |
Keywords: Deep Learning Methods, Deep Learning for Visual Perception, Deep Learning in Grasping and Manipulation
Abstract: This paper introduces a novel state estimation framework for robots using differentiable ensemble Kalman filters (DEnKF). DEnKF is a reformulation of the traditional ensemble Kalman filter that employs stochastic neural networks to model the process noise implicitly. Our work is an extension of previous research on differentiable filters, which has provided a strong foundation for our modular and end-to-end differentiable framework. This framework enables each component of the system to function independently, leading to improved flexibility and versatility in implementation. Through a series of experiments, we demonstrate the flexibility of this model across a diverse set of real-world tracking tasks, including visual odometry and robot manipulation. Moreover, we show that our model effectively handles noisy observations, is robust in the absence of observations, and outperforms state-of-the-art differentiable filters in terms of error metrics. Specifically, we observe a significant improvement of at least 59% in translational error when using DEnKF with noisy observations. Our results underscore the potential of DEnKF in advancing state estimation for robotics. Code for DEnKF is available at https://github.com/ir-lab/DEnKF
|
| |
| 09:42-09:48, Paper MoAT19.13 | Add to My Program |
| Self-Supervised Category-Level 6D Object Pose Estimation with Optical Flow Consistency |
|
| Zaccaria, Michela | E80Group S.p.A., University of Parma |
| Manhardt, Fabian | Google |
| Di, Yan | Technical University of Munich |
| Tombari, Federico | Technische Universität München |
| Aleotti, Jacopo | University of Parma |
| Giorgini, Mikhail | University of Parma, Elettric 80 S.p.A |
Keywords: Deep Learning for Visual Perception, Deep Learning Methods, RGB-D Perception
Abstract: Category-level 6D object pose estimation aims at determining the pose of an object of a given category. Most current state-of-the-art methods require a significant amount of real training data to supervise their models. Moreover, annotating the 6D pose is very time-consuming, error-prone, and does not scale well to a large number of object classes. Therefore, a handful of methods have recently been proposed to use unlabelled data to establish weak supervision. In this letter, we propose a self-supervised method that leverages the 2D optical flow as a proxy for supervising the 6D pose. To this end, we estimate the 2D optical flow between consecutive frames based on the pose estimation. Then, we harness an off-the-shelf optical flow method to enable weak supervision using a 2D-3D optical-flow-based consistency loss. Experiments show that our approach for self-supervised learning yields state-of-the-art performance on the NOCS benchmark, and it reaches results comparable to some fully-supervised approaches.
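The core supervision signal — comparing the 2D flow implied by the estimated poses against an off-the-shelf flow estimate — can be sketched generically (illustrative only; `pose_induced_flow` and the pinhole parameters are assumptions, not the paper's pipeline):

```python
import numpy as np

def project(points_cam, fx, fy, cx, cy):
    """Pinhole projection of 3-D camera-frame points to pixel coordinates."""
    return np.stack([fx * points_cam[:, 0] / points_cam[:, 2] + cx,
                     fy * points_cam[:, 1] / points_cam[:, 2] + cy], axis=1)

def pose_induced_flow(points_obj, R1, t1, R2, t2, fx, fy, cx, cy):
    """2-D flow implied by the estimated object poses of two consecutive frames."""
    uv1 = project(points_obj @ R1.T + t1, fx, fy, cx, cy)
    uv2 = project(points_obj @ R2.T + t2, fx, fy, cx, cy)
    return uv2 - uv1

def flow_consistency_loss(flow_pose, flow_net):
    """L1 discrepancy between pose-induced flow and an off-the-shelf flow estimate."""
    return np.abs(flow_pose - flow_net).mean()

# A pure 0.1 m x-translation at 2 m depth with fx = 500 gives 25 px of flow.
I = np.eye(3)
pts = np.array([[0.0, 0.0, 2.0], [0.5, 0.0, 2.0]])
flow = pose_induced_flow(pts, I, np.zeros(3), I, np.array([0.1, 0.0, 0.0]),
                         500.0, 500.0, 320.0, 240.0)
loss = flow_consistency_loss(flow, flow)  # identical flows -> zero loss
```

Minimizing this loss against the off-the-shelf flow provides the weak supervision without pose labels.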
|
| |
| MoAIP Interactive session, Hall E |
|
| Poster M1 |
|
| |
| |
| Subsession MoAIP-01, Hall E | |
| Clone of 'Semantic Scene Understanding' Regular session, 14 papers |
| |
| Subsession MoAIP-02, Hall E | |
| Clone of 'Wearable and Assistive Devices' Regular session, 12 papers |
| |
| Subsession MoAIP-03, Hall E | |
| Clone of 'Collision Avoidance I' Regular session, 13 papers |
| |
| Subsession MoAIP-04, Hall E | |
| Clone of 'Control Applications' Regular session, 14 papers |
| |
| Subsession MoAIP-05, Hall E | |
| Clone of 'Mechanism Design I' Regular session, 14 papers |
| |
| Subsession MoAIP-06, Hall E | |
| Clone of 'Modeling, Control, and Learning for Soft Robots I' Regular session, 13 papers |
| |
| Subsession MoAIP-07, Hall E | |
| Clone of 'Cooperating Robots' Regular session, 13 papers |
| |
| Subsession MoAIP-08, Hall E | |
| Clone of 'Legged Robots I' Regular session, 12 papers |
| |
| Subsession MoAIP-09, Hall E | |
| Clone of 'Motion and Path Planning I' Regular session, 13 papers |
| |
| Subsession MoAIP-10, Hall E | |
| Clone of 'Learning for Manipulation I' Regular session, 13 papers |
| |
| Subsession MoAIP-11, Hall E | |
| Clone of 'Aerial Systems - Applications I' Regular session, 14 papers |
| |
| Subsession MoAIP-12, Hall E | |
| Clone of 'Perception for Grasping and Manipulation I' Regular session, 12 papers |
| |
| Subsession MoAIP-13, Hall E | |
| Clone of 'Visual Learning' Regular session, 12 papers |
| |
| Subsession MoAIP-14, Hall E | |
| Clone of 'Localization I' Regular session, 13 papers |
| |
| Subsession MoAIP-15, Hall E | |
| Clone of 'Sensor Fusion for SLAM' Regular session, 13 papers |
| |
| Subsession MoAIP-16, Hall E | |
| Clone of 'Autonomous Agents' Regular session, 13 papers |
| |
| Subsession MoAIP-17, Hall E | |
| Clone of 'Imitation Learning' Regular session, 14 papers |
| |
| Subsession MoAIP-18, Hall E | |
| Clone of 'Calibration and Identification' Regular session, 14 papers |
| |
| Subsession MoAIP-19, Hall E | |
| Clone of 'Deep Learning Methods I' Regular session, 13 papers |
| |
| 10:00-11:30, Subsession MoAIP-20, Hall E | |
| Late Breaking Posters I Late breaking, 33 papers |
| |
| MoAIP-01 Regular session, Hall E |
Add to My Program |
| Clone of 'Semantic Scene Understanding' |
|
| |
| |
| 10:00-11:30, Paper MoAIP-01.1 | Add to My Program |
| Gaussian Radar Transformer for Semantic Segmentation in Noisy Radar Data |
|
| Zeller, Matthias | CARIAD SE |
| Behley, Jens | University of Bonn |
| Heidingsfeld, Michael | CARIAD SE |
| Stachniss, Cyrill | University of Bonn |
Keywords: Semantic Scene Understanding, Deep Learning Methods
Abstract: Scene understanding is crucial for autonomous robots in dynamic environments for making future state predictions, avoiding collisions, and path planning. Camera and LiDAR perception have made tremendous progress in recent years but face limitations under adverse weather conditions. To leverage the full potential of multi-modal sensor suites, radar sensors are essential for safety-critical tasks and are already installed in most new vehicles today. In this paper, we address the problem of semantic segmentation of moving objects in radar point clouds to enhance the perception of the environment with another sensor modality. Instead of aggregating multiple scans to densify the point clouds, we propose a novel approach based on the self-attention mechanism to accurately perform sparse, single-scan segmentation. Our approach, called Gaussian Radar Transformer, includes the newly introduced Gaussian transformer layer, which replaces the softmax normalization with a Gaussian function to decouple the contribution of individual points. To tackle the challenge transformers face in capturing long-range dependencies, we propose attentive up- and downsampling modules to enlarge the receptive field and capture strong spatial relations. We compare our approach to other state-of-the-art methods on the RadarScenes data set and show superior segmentation quality in diverse environments, even without exploiting temporal information.
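The decoupling effect of replacing softmax with a Gaussian can be illustrated with a toy example (a hypothetical reading of the mechanism, not the authors' implementation; the distance-based scores and `sigma` are assumptions):

```python
import numpy as np

def softmax_weights(scores):
    """Standard attention weights: coupled through the shared normalizer."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

def gaussian_weights(query, keys, sigma=1.0):
    """Gaussian weighting: each key's weight depends only on its own
    distance to the query, so adding or removing points does not
    change the contribution of the others."""
    d2 = ((keys - query) ** 2).sum(axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

q = np.zeros(3)
K2 = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
K3 = np.vstack([K2, [10.0, 0.0, 0.0]])  # same keys plus one far outlier
w2 = gaussian_weights(q, K2)
w3 = gaussian_weights(q, K3)
```

With softmax, appending the third key would rescale every weight; with the Gaussian, the first two weights are unchanged — one plausible reason this helps with sparse, noisy radar returns.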
|
| |
| 10:00-11:30, Paper MoAIP-01.2 | Add to My Program |
| Mask-Based Panoptic LiDAR Segmentation for Autonomous Driving |
|
| Marcuzzi, Rodrigo | University of Bonn |
| Nunes, Lucas | University of Bonn |
| Wiesmann, Louis | University of Bonn |
| Behley, Jens | University of Bonn |
| Stachniss, Cyrill | University of Bonn |
Keywords: Semantic Scene Understanding, Deep Learning Methods
Abstract: Autonomous vehicles need to understand their surroundings geometrically and semantically to plan and act appropriately in the real world. Panoptic segmentation of LiDAR scans provides a description of the surroundings by unifying semantic and instance segmentation. It is usually solved in a bottom-up manner consisting of two steps: predicting the semantic class for each 3D point and using this information to filter out 'stuff' points, then clustering the 'thing' points to obtain the instance segmentation. The clustering is a post-processing step that often needs hyperparameter tuning, which usually does not adapt to instances of different sizes or different datasets. To address this, we propose MaskPLS, an approach to perform panoptic segmentation of LiDAR scans in an end-to-end manner by predicting a set of non-overlapping binary masks and semantic classes, fully avoiding the clustering step. As a result, each mask represents a single instance belonging to a 'thing' class or a complete 'stuff' class. Experiments on SemanticKITTI show that the end-to-end learnable mask generation leads to superior performance compared to state-of-the-art heuristic approaches.
|
| |
| 10:00-11:30, Paper MoAIP-01.3 | Add to My Program |
| SCENE: Reasoning about Traffic Scenes Using Heterogeneous Graph Neural Networks |
|
| Schmidt, Julian | Mercedes-Benz AG, Ulm University |
| Monninger, Thomas | Mercedes-Benz AG, University of Stuttgart |
| Rupprecht, Jan | Mercedes-Benz AG |
| Raba, David | Mercedes Benz AG |
| Jordan, Julian | Mercedes-Benz AG |
| Frank, Daniel | University of Stuttgart |
| Staab, Steffen | University of Stuttgart |
| Dietmayer, Klaus | University of Ulm |
Keywords: Semantic Scene Understanding, AI-Based Methods, Behavior-Based Systems
Abstract: Understanding traffic scenes requires considering heterogeneous information about dynamic agents and the static infrastructure. In this work, we propose SCENE, a methodology to encode diverse traffic scenes in heterogeneous graphs and to reason about these graphs using a heterogeneous Graph Neural Network encoder and task-specific decoders. The heterogeneous graphs, whose structures are defined by an ontology, consist of different nodes with type-specific node features and different relations with type-specific edge features. In order to exploit all the information given by these graphs, we propose to use cascaded layers of graph convolution. The result is an encoding of the scene. Task-specific decoders can be applied to predict desired attributes of the scene. Extensive evaluation on two diverse binary node classification tasks shows the main strength of this methodology: despite being generic, it even manages to outperform task-specific baselines. The further application of our methodology to the task of node classification in various knowledge graphs shows its transferability to other domains.
|
| |
| 10:00-11:30, Paper MoAIP-01.4 | Add to My Program |
| Prototypical Contrastive Transfer Learning for Multimodal Language Understanding |
|
| Otsuki, Seitaro | Keio University |
| Ishikawa, Shintaro | Keio University |
| Sugiura, Komei | Keio University |
Keywords: Transfer Learning, Semantic Scene Understanding, Multi-Modal Perception for HRI
Abstract: Although domestic service robots are expected to assist individuals who require support, they cannot currently interact smoothly with people through natural language. For example, given the instruction "Bring me a bottle from the kitchen," it is difficult for such robots to specify the bottle in an indoor environment. Most conventional models have been trained on real-world datasets that are labor-intensive to collect, and they have not fully leveraged simulation data through a transfer learning framework. In this study, we propose a novel transfer learning approach for multimodal language understanding called Prototypical Contrastive Transfer Learning (PCTL), which uses a new contrastive loss called Dual ProtoNCE. We introduce PCTL to the task of identifying target objects in domestic environments according to free-form natural language instructions. To validate PCTL, we built new real-world and simulation datasets. Our experiment demonstrated that PCTL outperformed existing methods. Specifically, PCTL achieved an accuracy of 78.1%, whereas simple fine-tuning achieved an accuracy of 73.4%.
|
| |
| 10:00-11:30, Paper MoAIP-01.5 | Add to My Program |
| Re-Thinking Classification Confidence with Model Quality Quantification |
|
| Pan, Yancheng | Peking University |
| Zhao, Huijing | Peking University |
Keywords: Semantic Scene Understanding, Autonomous Agents
Abstract: Deep neural networks used for real-world classification tasks require high reliability and robustness. However, the softmax output of the network's last layer is often over-confident. We propose a novel confidence estimation method that considers model quality for deep classification models. Two metrics, MQ-Repres and MQ-Discri, are developed accordingly to evaluate the model quality, and they also provide a new confidence estimate, called MQ-Conf, for online inference. We demonstrate the capability of the proposed method on 3D semantic segmentation tasks using three different deep networks. Through confusion analysis and feature visualization, we show the rationality and reliability of the model quality quantification method.
|
| |
| 10:00-11:30, Paper MoAIP-01.6 | Add to My Program |
| Self-Supervised Drivable Area Segmentation Using LiDAR's Depth Information for Autonomous Driving |
|
| Ma, Fulong | The Hong Kong University of Science and Technology |
| Liu, Yang | The Hong Kong University of Science and Technology |
| Wang, Sheng | Hong Kong University of Science and Technology |
| Jin, Wu | UESTC |
| Qi, Weiqing | HKUST |
| Liu, Ming | Hong Kong University of Science and Technology |
Keywords: Semantic Scene Understanding, Perception for Grasping and Manipulation, Mapping
Abstract: Drivable area segmentation is an essential component of the visual perception system for autonomous driving vehicles. Recent efforts in deep neural networks have significantly improved semantic segmentation performance for autonomous driving. However, most DNN-based methods need a large amount of data to train the models, and collecting large-scale datasets with manually labeled ground truth is costly, tedious, time-consuming, and requires the availability of experts, making DNN-based methods often difficult to implement in real-world applications. Hence, in this paper, we introduce a novel module named automatic data labeler (ADL), which leverages a deterministic LiDAR-based method for ground plane segmentation and road boundary detection to create large datasets suitable for training DNNs. Furthermore, since the data generated by our ADL module is not as accurate as manually annotated data, we introduce uncertainty estimation to compensate for the gap between the human labeler and our ADL. Finally, we train the semantic segmentation neural networks using our automatically generated labels on the KITTI dataset and KITTI-CARLA dataset. The experimental results demonstrate that our proposed ADL method not only achieves impressive performance compared to manual labeling but also exhibits more robust and accurate results than both traditional methods and state-of-the-art self-supervised methods.
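A deterministic ground-plane step of the kind the abstract describes could, for instance, be a RANSAC plane fit (a generic sketch under assumed parameters — the paper's actual ADL pipeline may differ):

```python
import numpy as np

def ransac_ground_plane(points, n_iters=200, thresh=0.1, rng=None):
    """Fit a ground plane to LiDAR points with RANSAC: repeatedly fit a
    plane through 3 random points and keep the one with the most inliers.

    Returns ((normal, d), inlier_mask) for the plane n.x + d = 0.
    """
    rng = rng or np.random.default_rng(0)
    best_inliers, best_plane = None, None
    for _ in range(n_iters):
        p = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p[1] - p[0], p[2] - p[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:            # degenerate (collinear) sample
            continue
        n = n / norm
        d = -n @ p[0]
        inliers = np.abs(points @ n + d) < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, (n, d)
    return best_plane, best_inliers

# Synthetic scan: a flat ground plane at z = 0 plus elevated clutter.
rng = np.random.default_rng(1)
ground = np.column_stack([rng.uniform(-10, 10, 300),
                          rng.uniform(-10, 10, 300),
                          np.zeros(300)])
clutter = np.column_stack([rng.uniform(-10, 10, 60),
                           rng.uniform(-10, 10, 60),
                           rng.uniform(1.0, 5.0, 60)])
(plane_n, plane_d), inliers = ransac_ground_plane(np.vstack([ground, clutter]))
```

The inlier mask then directly yields ground/non-ground pseudo-labels for training the segmentation network.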
|
| |
| 10:00-11:30, Paper MoAIP-01.7 | Add to My Program |
| Vehicle Motion Forecasting Using Prior Information and Semantic-Assisted Occupancy Grid Maps |
|
| Asghar, Rabbia | INRIA / Univ. Grenoble Alpes |
| Diaz-Zapata, Manuel | Inria Grenoble |
| Rummelhard, Lukas | INRIA |
| Spalanzani, Anne | INRIA / Univ. Grenoble Alpes |
| Laugier, Christian | INRIA |
Keywords: Semantic Scene Understanding, Deep Learning Methods, Autonomous Vehicle Navigation
Abstract: Motion prediction is a challenging task for autonomous vehicles due to uncertainty in the sensor data, the non-deterministic nature of the future, and the complex behavior of agents. In this paper, we tackle this problem by representing the scene as dynamic occupancy grid maps (DOGMs), associating semantic labels with the occupied cells, and incorporating map information. We propose a novel framework that combines deep-learning-based spatio-temporal and probabilistic approaches to predict multimodal vehicle behaviors. Contrary to conventional OGM prediction methods, evaluation of our work is conducted against ground truth annotations. We validate our results on the real-world NuScenes dataset and show that our model predicts both static and dynamic vehicles better than OGM-based predictions. Furthermore, we perform an ablation study and assess the role of the semantic labels and the map in the architecture.
|
| |
| 10:00-11:30, Paper MoAIP-01.8 | Add to My Program |
| Enhance Local Feature Consistency with Structure Similarity Loss for 3D Semantic Segmentation |
|
| Lin, Cheng-Wei | Department of Computer Science, National Yang Ming Chiao Tung Un |
| Syu, Fang-Yu | Department of Computer Science, National Yang Ming Chiao Tung Un |
| Pan, Yi-Ju | National Yang Ming Chiao Tung University |
| Chen, Kuan-Wen | National Yang Ming Chiao Tung University |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Deep Learning for Visual Perception
Abstract: Recently, many research studies have been carried out on using deep learning methods for 3D point cloud understanding. However, results on 3D point cloud semantic segmentation still lag behind those of 2D research. One important reason is that 3D data has higher dimensionality but lacks large datasets, which makes deep learning models difficult to optimize and prone to overfitting. To overcome this, an essential approach is to provide more priors to the learning of deep models. In this paper, we focus on semantic segmentation for point clouds in the real world. To provide priors to the model, we propose a novel loss function, called Linearity and Planarity, to enhance local feature consistency in regions with a similar local structure. Experiments show that the proposed method improves baseline performance on both indoor and outdoor datasets, e.g., S3DIS and Semantic3D.
|
| |
| 10:00-11:30, Paper MoAIP-01.9 | Add to My Program |
| Lightweight Semantic Segmentation Network for Semantic Scene Understanding on Low-Compute Devices |
|
| Son, Hojun | University of Michigan |
| Weiland, James | University of Michigan |
Keywords: Semantic Scene Understanding, Embedded Systems for Robotic and Automation, Deep Learning for Visual Perception
Abstract: Semantic scene understanding is beneficial for mobile robots. Semantic information obtained through onboard cameras can improve robots' navigation performance. However, obtaining semantic information on small mobile robots with constrained power and computation resources is challenging. We propose a new lightweight convolutional neural network comparable to previous semantic segmentation algorithms for mobile applications. Our network achieved 73.06% on the Cityscapes validation set and 71.8% on the Cityscapes test set. Our model runs at 116 FPS with 1024x2048 input, 172 FPS with 1024x1024, and 175 FPS with 720x960 on an NVIDIA GTX 1080. We analyze model size, defined as the sum of the number of floating-point operations and the number of parameters. A smaller model size enables tiny mobile robot systems, which must run multiple tasks simultaneously, to work efficiently. Our model has the smallest model size compared to the real-time semantic segmentation convolutional neural networks ranked on the Cityscapes real-time benchmark and other high-performing, lightweight convolutional neural networks. On the CamVid test set, our model achieved a mIoU of 73.29% with Cityscapes pre-training, which outperformed the accuracy of other lightweight convolutional neural networks. For mobile applicability, we measured frames-per-second on different low-compute devices. Our model runs at 35 FPS on a Jetson Xavier AGX, 21 FPS on a Jetson Xavier NX, and 14 FPS on a ROS ASUS gaming phone. The 1024x2048 resolution is used for the Jetson devices, and 512x512 is used for the measurement on the phone. Our network did not use extra datasets such as ImageNet, Coarse Cityscapes, or Mapillary. Additionally, we did not use TensorRT to achieve fast inference speed. Compared to other real-time and lightweight CNNs, our model achieves significantly higher efficiency while balancing accuracy, inference speed, and model size.
|
| |
| 10:00-11:30, Paper MoAIP-01.10 | Add to My Program |
| LiDAR-SGMOS: Semantics-Guided Moving Object Segmentation with 3D LiDAR |
|
| Gu, Shuo | Nanjing University of Science and Technology |
| Yao, Suling | Nanjing University of Science and Technology |
| Yang, Jian | Nanjing University of Science & Technology |
| Xu, Chengzhong | University of Macau |
| Kong, Hui | University of Macau |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Deep Learning Methods
Abstract: Most existing moving object segmentation (MOS) methods regard MOS as an independent task. In this paper, we associate the MOS task with semantic segmentation and propose a semantics-guided network for moving object segmentation (LiDAR-SGMOS). We first transform the range image and semantic features of the past scan into the range view of the current scan based on the relative pose between scans. The residual image is obtained by calculating the normalized absolute difference between the current and transformed range images. Then, we apply a Meta-Kernel-based cross-scan fusion (CSF) module to adaptively fuse the range image and semantic features of the current scan with the residual image and transformed features. Finally, the fused features, rich in motion and semantic information, are processed to obtain reliable MOS results. We also introduce a residual image augmentation method to further improve MOS performance. Our method outperforms most LiDAR-MOS methods with only two sequential LiDAR scans as inputs on the SemanticKITTI MOS dataset.
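The residual image described in the abstract — a normalized absolute difference between the current range image and the pose-transformed previous one — can be sketched as follows (a minimal illustration; the validity masking and `eps` handling are assumptions):

```python
import numpy as np

def residual_image(curr_range, prev_range_transformed, eps=1e-6):
    """Normalized absolute difference between the current range image and
    the previous scan's range image re-projected (via the relative pose)
    into the current view. Static structure yields ~0; moving objects
    yield large residuals."""
    valid = (curr_range > 0) & (prev_range_transformed > 0)
    res = np.zeros_like(curr_range, dtype=float)
    res[valid] = (np.abs(curr_range[valid] - prev_range_transformed[valid])
                  / (curr_range[valid] + eps))
    return res

# Three pixels: static (10 m in both), moving (10 m vs 6 m), and invalid (0).
curr = np.array([[10.0, 10.0, 0.0]])
prev = np.array([[10.0, 6.0, 7.0]])
res = residual_image(curr, prev)
```

The residual then serves as a motion cue that the network fuses with the semantic features.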
|
| |
| 10:00-11:30, Paper MoAIP-01.11 | Add to My Program |
| Robust Fusion for Bayesian Semantic Mapping |
|
| Morilla-Cabello, David | Universidad De Zaragoza |
| Mur Labadia, Lorenzo | University of Zaragoza |
| Martinez-Cantin, Ruben | University of Zaragoza |
| Montijano, Eduardo | Universidad De Zaragoza |
Keywords: Semantic Scene Understanding, Mapping, Deep Learning for Visual Perception
Abstract: The integration of semantic information in a map allows robots to understand their environment better and make high-level decisions. In the last few years, neural networks have shown enormous progress in their perception capabilities. However, when fusing multiple observations from a neural network in a semantic map, the network's inherent overconfidence with unknown data gives too much weight to outliers and decreases robustness. To mitigate this issue, we propose a novel robust fusion method to combine multiple Bayesian semantic predictions. Our method uses the uncertainty estimation provided by a Bayesian neural network to calibrate the way in which the measurements are fused. This is done by regularizing the observations to mitigate the problem of overconfident outlier predictions and by using the epistemic uncertainty to weigh their influence in the fusion, resulting in a different formulation of the probability distributions. We validate our robust fusion strategy by performing experiments on photo-realistic simulated environments and real scenes. In both cases, we use a network trained on different data to expose the model to varying data distributions. The results show that considering the model's uncertainty and regularizing the probability distribution of the observations yields better semantic segmentation performance and more robustness to outliers, compared with other methods.
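One way to combine the two ingredients in the abstract — regularizing observations and weighting them by epistemic uncertainty — is a log-linear fusion (a hedged sketch; the `floor` regularizer and inverse-uncertainty weights are assumptions, not the paper's exact formulation):

```python
import numpy as np

def robust_fuse(probs, epistemic, floor=0.1):
    """Fuse per-observation class probabilities (shape (T, C)).

    Each observation is first regularized toward the uniform distribution
    (mitigating overconfident outliers), then fused log-linearly with
    weights inversely proportional to its epistemic uncertainty."""
    T, C = probs.shape
    reg = (1.0 - floor) * probs + floor / C       # pull toward uniform
    w = 1.0 / (np.asarray(epistemic) + 1e-9)      # certain obs weigh more
    w = w / w.sum()
    log_fused = (w[:, None] * np.log(reg)).sum(axis=0)
    fused = np.exp(log_fused - log_fused.max())
    return fused / fused.sum()

# Two confident correct observations plus one overconfident, uncertain outlier.
probs = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.01, 0.99]])
fused = robust_fuse(probs, epistemic=[0.05, 0.05, 5.0])
```

Because the outlier carries high epistemic uncertainty, its near-certain (but wrong) vote barely moves the fused distribution.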
|
| |
| 10:00-11:30, Paper MoAIP-01.12 | Add to My Program |
| ConSOR: A Context-Aware Semantic Object Rearrangement Framework for Partially Arranged Scenes |
|
| Ramachandruni, Kartik | Georgia Institute of Technology |
| Zuo, Max | Georgia Institute of Technology |
| Chernova, Sonia | Georgia Institute of Technology |
Keywords: Semantic Scene Understanding, Deep Learning Methods
Abstract: Object rearrangement is the problem of enabling a robot to identify the correct object placement in a complex environment. Prior work on object rearrangement has explored a diverse set of techniques for following user instructions to achieve some desired goal state. Logical predicates, images of the goal scene, and natural language descriptions have all been used to instruct a robot in how to arrange objects. In this work, we argue that burdening the user with specifying goal scenes is not necessary in partially-arranged environments, such as common household settings. Instead, we show that contextual cues from partially arranged scenes (i.e., the placement of some number of pre-arranged objects in the environment) provide sufficient context to enable robots to perform object rearrangement without any explicit user goal specification. We introduce ConSOR, a Context-aware Semantic Object Rearrangement framework that utilizes contextual cues from a partially arranged initial state of the environment to complete the arrangement of new objects, without explicit goal specification from the user. We demonstrate that ConSOR strongly outperforms two baselines in generalizing to novel object arrangements and unseen object categories. The code and data are available at https://github.com/kartikvrama/consor.
|
| |
| 10:00-11:30, Paper MoAIP-01.13 | Add to My Program |
| IDA: Informed Domain Adaptive Semantic Segmentation |
|
| Chen, Zheng | Indiana University Bloomington |
| Ding, Zhengming | Tulane University |
| Gregory, Jason M. | US Army Research Laboratory |
| Liu, Lantao | Indiana University |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Object Detection, Segmentation and Categorization
Abstract: Mixup-based data augmentation has been validated to be a critical stage in the self-training framework for unsupervised domain adaptive semantic segmentation (UDA-SS), which aims to transfer knowledge from a well-annotated (source) domain to an unlabeled (target) domain. Existing self-training methods usually adopt the popular region-based mixup techniques with a random sampling strategy, which unfortunately ignores the dynamic evolution of different semantics across various domains as training proceeds. To improve the UDA-SS performance, we propose an Informed Domain Adaptation (IDA) model, a self-training framework that mixes the data based on class-level segmentation performance, which aims to emphasize small-region semantics during mixup. In our IDA model, the class-level performance is tracked by an expected confidence score (ECS). We then use a dynamic schedule to determine the mixing ratio for data in different domains. Extensive experimental results reveal that our proposed method is able to outperform the state-of-the-art UDA-SS method by a margin of 1.1 mIoU in the adaptation of GTA-V to Cityscapes and of 0.9 mIoU in the adaptation of SYNTHIA to Cityscapes.
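The idea of emphasizing poorly learned semantics via the expected confidence score (ECS) can be sketched as a class-sampling rule for the mixup (an illustrative assumption — the paper's dynamic schedule and exact ECS definition are not reproduced here):

```python
import numpy as np

def class_sampling_probs(ecs, temperature=0.5):
    """Map per-class expected confidence scores (ECS) to sampling
    probabilities for the mixup: lower confidence -> sampled more often,
    emphasizing small or poorly learned semantics."""
    logits = -np.asarray(ecs, dtype=float) / temperature
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

# Hypothetical ECS values for three classes, e.g. road, car, pole.
p = class_sampling_probs([0.95, 0.60, 0.20])
```

The class with the lowest ECS (here the third) is pasted into mixed samples most often, which is the informed counterpart of random region sampling.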
|
| |
| 10:00-11:30, Paper MoAIP-01.14 | Add to My Program |
| Self-Supervised Learning for Panoptic Segmentation of Multiple Fruit Flower Species |
|
| Siddique, Abubakar | Marquette University |
| Tabb, Amy | USDA-ARS-AFRS |
| Medeiros, Henry | University of Florida |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Incremental Learning
Abstract: Convolutional neural networks trained using manually generated labels are commonly used for semantic or instance segmentation. In precision agriculture, automated flower detection methods use supervised models and post-processing techniques that may not perform consistently as the appearance of the flowers and the data acquisition conditions vary. We propose a self-supervised learning strategy to enhance the sensitivity of segmentation models to different flower species using automatically generated pseudo-labels. We employ a data augmentation and refinement approach to improve the accuracy of the model predictions. The augmented semantic predictions are then converted to panoptic pseudo-labels to iteratively train a multi-task model. The self-supervised model predictions can be refined with existing post-processing approaches to further improve their accuracy. An evaluation on a multi-species fruit tree flower dataset demonstrates that our method outperforms state-of-the-art models without computationally expensive post-processing steps, providing a new baseline for flower detection applications.
|
| |
| MoAIP-02 Regular session, Hall E |
Add to My Program |
| Clone of 'Wearable and Assistive Devices' |
|
| |
| |
| 10:00-11:30, Paper MoAIP-02.1 | Add to My Program |
| Combined Admittance Control with Type II Singularity Evasion for Parallel Robots Using Dynamic Movement Primitives (I) |
|
| Escarabajal, Rafael J. | Universidad Politécnica De Valencia |
| Pulloquinga, José Luis | Universidad Politécnica De Valencia |
| Valera, Angel | Universidad Politécnica De Valencia |
| Mata, Vicente | Universidad Politécnica De Valencia |
| Valles, Marina | Universitat Politècnica De València |
| Castillo-García, Fernando J. | Universidad De Castilla-La Mancha |
Keywords: Rehabilitation Robotics, Parallel Robots, Compliance and Impedance Control, Dynamic Movement Primitives
Abstract: This paper addresses a new way of generating compliant trajectories for control using movement primitives to allow physical human-robot interaction where parallel robots (PRs) are involved. PRs are suitable for tasks requiring precision and performance because of their robust behavior. However, two fundamental issues must be resolved to ensure safe operation: i) the force exerted on the human must be controlled and limited, and ii) Type II singularities should be avoided to keep complete control of the robot. We offer a unified solution under the Dynamic Movement Primitives (DMP) framework to tackle both tasks simultaneously. DMPs provide an abstract representation for movement generation and are used in broad areas such as imitation learning and movement recognition. For force control, we design an admittance controller intrinsically defined within the DMP structure, and subsequently, the Type II singularity evasion layer is added to the system. Both the admittance controller and the evader exploit the dynamic behavior of the DMP and its properties related to invariance and temporal coupling, and the whole system is deployed in a real PR meant for knee rehabilitation. The results show the capability of the system to perform safe rehabilitation exercises.
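A DMP transformation system with a simple force coupling — one generic way to embed admittance behavior in the DMP, not the authors' controller — can be sketched as a single integration step (gains `alpha`, `beta`, `k_adm` are illustrative):

```python
def dmp_admittance_step(y, z, g, f_ext, dt, tau=1.0,
                        alpha=25.0, beta=6.25, k_adm=0.5):
    """One Euler step of a DMP transformation system,
        tau * z_dot = alpha * (beta * (g - y) - z) + k_adm * f_ext,
        tau * y_dot = z,
    where the measured interaction force f_ext perturbs the commanded
    motion so the robot yields compliantly to the human."""
    z_dot = (alpha * (beta * (g - y) - z) + k_adm * f_ext) / tau
    y_dot = z / tau
    return y + y_dot * dt, z + z_dot * dt

# Free motion (no external force) converges smoothly to the goal g = 1.
y, z = 0.0, 0.0
for _ in range(2000):
    y, z = dmp_admittance_step(y, z, g=1.0, f_ext=0.0, dt=0.01)
```

With `alpha = 25` and `beta = alpha / 4` the unforced system is critically damped; a sustained external force simply shifts the equilibrium, which is the compliant (admittance) behavior.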
|
| |
| 10:00-11:30, Paper MoAIP-02.2 | Add to My Program |
| A Handle Robot for Providing Bodily Support to Elderly Persons |
|
| Bolli, Roberto | MIT |
| Bonato, Paolo | Harvard Medical School |
| Asada, Harry | MIT |
Keywords: Physically Assistive Devices, Human-Robot Collaboration, Domestic Robotics
Abstract: Age-related loss of mobility and an increased risk of falling remain major obstacles for older adults to live independently. Many elderly people lack the coordination and strength necessary to perform activities of daily living, such as getting out of bed or stepping into a bathtub. A traditional solution is to install grab bars around the home. For assisting in bathtub transitions, grab bars are fixed to a bathroom wall. However, they are often too far to reach and stably support the user; the installation locations of grab bars are constrained by the room layout and are often suboptimal. In this paper, we present a mobile robot that provides an older adult with a handlebar located anywhere in space - "Handle Anywhere". The robot consists of an omnidirectional mobile base attached to a repositionable handlebar. We further develop a methodology to optimally place the handle to provide the maximum support for the elderly user while performing common postural changes. A cost function with a trade-off between mechanical advantage and manipulability of the user's arm was optimized in terms of the location of the handlebar relative to the user. The methodology requires only a sagittal plane video of the elderly user performing the postural change, and thus is rapid, scalable, and uniquely customizable to each user. A proof-of-concept prototype was built, and the optimization algorithm for handle location was validated experimentally.
|
| |
| 10:00-11:30, Paper MoAIP-02.3 | Add to My Program |
| A Hybrid FNS Generator for Human Trunk Posture Control with Incomplete Knowledge of Neuromusculoskeletal Dynamics |
|
| Bao, Xuefeng | Case Western Reserve University |
| Friederich, Aidan | Case Western Reserve University |
| Triolo, Ronald | Case Western Reserve University |
| Audu, Musa. L. | Case Western Reserve University |
Keywords: Rehabilitation Robotics, Modeling and Simulating Humans, Motion Control
Abstract: The trunk movements of an individual paralyzed by spinal cord injury (SCI) can be restored by Functional Neuromuscular Stimulation (FNS), a technique that applies low-level current to motor nerves to activate the muscles that generate torques and thus produce trunk motions. FNS can be modulated to control trunk movements. However, a stabilizing modulation policy (i.e., control law) is difficult to derive due to the complexity of the neuromusculoskeletal dynamics, which consist of skeletal dynamics (i.e., multi-joint rigid-body dynamics) and neuromuscular dynamics (i.e., highly nonlinear, nonautonomous, and input-redundant dynamics). Therefore, an FNS-based control method that can stabilize the trunk without knowing the accurate skeletal and neuromuscular dynamics is desired. This work proposes an FNS generator, which consists of a robust nonlinear controller (RNC) that provides a stabilizing torque command and an artificial neural network (ANN)-based torque-to-activation (T-A) map that ensures the muscles generate the stabilizing torque on the skeleton. Due to the robustness and learning capability of this control framework, full knowledge of the trunk neuromusculoskeletal dynamics is not required. The proposed control framework has been tested in a simulation environment where an anatomically realistic 3D musculoskeletal model of the human trunk was manipulated to follow a time-varying reference moving in the anterior-posterior and medial-lateral directions. The results show that the trunk motion converges to a satisfactory trajectory while the ANN is being updated, suggesting the potential of this control framework for trunk tracking tasks in clinical applications.
|
| |
| 10:00-11:30, Paper MoAIP-02.4 | Add to My Program |
| Insole-Type Walking Assist Device Capable of Inducing Inversion-Eversion of the Ankle Angle to the Neutral Position |
|
| Itami, Taku | Aoyama Gakuin University |
| Date, Kazuki | Aoyama Gakuin University |
| Ishii, Yuuta | Aoyama Gakuin University |
| Yoneyama, Jun | Aoyama Gakuin University |
| Aoki, Takaaki | Gifu University |
Keywords: Prosthetics and Exoskeletons, Robotics and Automation in Life Sciences, Body Balancing
Abstract: In recent years, the aging of society has become a serious problem, especially in developed countries. Walking is an important element in extending healthy life expectancy in old age. In particular, inducing proper ankle joint alignment at heel contact is important during the gait cycle from the perspective of smooth weight transfer and reduced load on the knees and hips. In this study, we focus on the behavior of the ankle joint at heel contact and propose an insole-type assist device that can induce inversion/eversion rotation of the ankle angle. In the proposed device, the heel part tilts from left to right in response to the rotation of a stepping motor, and an inertial sensor mounted inside controls the heel part so that it always maintains a horizontal position. The effectiveness of the proposed device is verified by evaluating the amount of lateral thrust of the knee joint of six healthy male subjects during a foot-stepping motion using a motion capture system. The results show that the amount of lateral thrust is significantly reduced by wearing the device with control enabled.
|
| |
| 10:00-11:30, Paper MoAIP-02.5 | Add to My Program |
| Design for Hip Abduction Assistive Device Based on Relationship between Hip Joint Motion and Torque During Running |
|
| Lee, Myunghyun | Agency for Defense Development |
| Hong, Man Bok | Agency for Defense Development |
| Kim, Gwang Tae | Agency for Defense Development |
| Kim, Seonwoo | Agency for Defense Development |
Keywords: Physically Assistive Devices, Human Performance Augmentation, Mechanism Design
Abstract: Numerous attempts have been made to reduce the metabolic energy of running with the help of assistive devices. A majority of studies on such devices have focused on assisting torque in the sagittal plane. In the case of running, however, the abduction torque in the frontal plane at the hip joint is greater than the flexion/extension torque in the sagittal plane. During running, as in an elastic body, the abduction torque and the motion of the hip joint have a linear relationship but are opposite in direction. The hip abduction torque can therefore be assisted by a simple passive method using an elastic body that reflects the movement characteristics of the hip joint. In this study, a system that assists hip abduction torque using a leaf spring was proposed and tested with a prototype. While running with the proposed assist system, the leaf spring aids the abduction torque during the stance phase, and no torque is generated during the swing phase owing to a passive revolute joint. The joint angle changes with respect to rotation in the flexion/extension direction to prevent uncomfortable torque during the swing phase and to increase the duration of torque action during the stance phase. A preliminary test was conducted on one subject using the prototype of the hip abduction torque assistive device. The participant wearing the assistive device reduced metabolic energy by 5% compared to the case without abduction torque assist while running at 2.5 m/s. To increase the metabolic reduction, the device should be improved through system mass reduction and hip joint position optimization.
|
| |
| 10:00-11:30, Paper MoAIP-02.6 | Add to My Program |
| Dynamic Hand Proprioception Via a Wearable Glove with Fabric Sensors |
|
| Behnke, Lily | Yale University |
| Sanchez-Botero, Lina | Yale University |
| Johnson, William | Yale University |
| Agrawala, Anjali | Yale University |
| Kramer-Bottiglio, Rebecca | Yale University |
Keywords: Wearable Robotics, Soft Sensors and Actuators, Soft Robot Materials and Design
Abstract: Continuous enhancement in wearable technologies has led to several innovations in the healthcare, virtual reality, and robotics sectors. One form of wearable technology is wearable sensors for kinematic measurements of human motion. However, measuring the kinematics of human movement is a challenging problem, as wearable sensors need to conform to complex curvatures and deform without limiting the user's natural range of motion. In fine motor activities, such challenges are further exacerbated by the dense packing of several joints, coupled joint motions, and relatively small deformations. This work presents the design, fabrication, and characterization of a thin, breathable sensing glove capable of reconstructing fine motor kinematics. The fabric glove features capacitive sensors made from layers of conductive and dielectric fabrics, culminating in a non-bulky and discreet glove design. This study demonstrates that the glove can reconstruct the joint angles of the wearer with a root mean square error of 7.2 degrees, indicating promising applicability to dynamic pose reconstruction for wearable technology and robot teleoperation.
|
| |
| 10:00-11:30, Paper MoAIP-02.7 | Add to My Program |
| A Wearable Robotic Rehabilitation System for Neuro-Rehabilitation Aimed at Enhancing Mediolateral Balance |
|
| Yu, Zhenyuan | North Carolina State University |
| Nalam, Varun | North Carolina State University |
| Alili, Abbas | NC State University |
| Huang, He (Helen) | North Carolina State University and University of North Carolina |
Keywords: Rehabilitation Robotics, Prosthetics and Exoskeletons, Physical Human-Robot Interaction
Abstract: There is increasing evidence of the role of compromised mediolateral balance in falls and of the need for rehabilitation specifically focused on the mediolateral direction for various populations with motor deficits. To address this need, we have developed a neurorehabilitation platform by integrating a wearable robotic hip abduction-adduction exoskeleton with a visual interface. The platform is expected to influence and rehabilitate the underlying visuomotor mechanisms by having users perform motion tasks based on visual feedback while the robot applies various controlled resistances governed by the admittance controller implemented in the robot. A preliminary study was performed on three non-disabled individuals to analyze the performance of the system and observe any adaptation in hip joint kinematics and kinetics as a result of the visuomotor training under four different admittance conditions. All three subjects exhibited increased consistency of motion during training and interlimb coordination to achieve motion tasks, demonstrating the utility of the system. Further analysis of the observed human-robot torque interactions and electromyography (EMG) signals, and their implications for neurorehabilitation aimed at populations suffering from chronic stroke, is discussed.
|
| |
| 10:00-11:30, Paper MoAIP-02.8 | Add to My Program |
| Analysis of Lower Extremity Shape Characteristics in Various Walking Situations for the Development of Wearable Robot |
|
| Park, Joohyun | KAIST, KIST |
| Choi, Ho Seon | Yonsei University |
| In, HyunKi | Korea Institute of Science and Technology |
Keywords: Datasets for Human Motion, Wearable Robotics, Physical Human-Robot Interaction
Abstract: A strap is a frequently utilized component for securing wearable robots to their users in order to facilitate force transmission between the human and the device. For the wearable robot to function properly, the pressure between the strap and the skin should be maintained at an appropriate level. Due to muscle contraction, the cross-sectional area of the human limb changes with muscle movement, which in turn changes the pressure applied by the strap. Therefore, to design a new strap that resolves this, it is necessary to understand the shape change characteristics of the muscles where the strap is applied. In this paper, changes in the circumference of the thigh and the calf during walking were measured and analyzed with multiple string-pot sensors. Using a treadmill and string-pot sensors built from potentiometers and torsion springs, leg circumference changes were measured for different walking speeds and slopes, and gait cycles were segmented according to the signal from an FSR sensor inserted in the right shoe. The experimental results showed circumference changes of about 8.5 mm and 3 mm for the thigh and the calf, respectively, and revealed consistent tendencies across walking conditions such as walking speed and slope. It is confirmed that these measurements can be used in estimation algorithms for gait cycles or walking conditions.
|
| |
| 10:00-11:30, Paper MoAIP-02.9 | Add to My Program |
| Finding Biomechanically Safe Trajectories for Robot Manipulation of the Human Body in a Search and Rescue Scenario |
|
| Peiros, Lizzie | University of California, San Diego |
| Chiu, Zih-Yun | University of California, San Diego |
| Zhi, Yuheng | University of California, San Diego |
| Shinde, Nikhil | University of California San Diego |
| Yip, Michael C. | University of California, San Diego |
Keywords: Physical Human-Robot Interaction, Modeling and Simulating Humans, Dynamics
Abstract: There has been increasing awareness of the difficulties in reaching and extracting people from mass casualty scenarios, such as those arising from natural disasters. While platforms have been designed to reach casualties and even carry them out of harm's way, the challenge of physically repositioning a casualty from its found configuration to one suitable for extraction has not been explicitly explored. Furthermore, this type of planning problem needs to incorporate biomechanical safety considerations for the casualty. Thus, we present the problem formulation for biomechanically safe trajectory generation for repositioning limbs of unconscious human casualties. We describe biomechanical safety in robotics terms, provide mechanical descriptions of the dynamics of the robot-human coupled system, and present the planning and trajectory optimization process that considers this coupled and constrained system. We finally evaluate the work over several variations of the problem and provide a live example. This work provides a crucial part of search and rescue that can be used in conjunction with past and present works involving robots and vision systems designed for search and rescue.
|
| |
| 10:00-11:30, Paper MoAIP-02.10 | Add to My Program |
| Mechanical Characterisation of Woven Pneumatic Active Textile |
|
| Marshall, Ruby | The University of Edinburgh |
| Souppez, Jean-Baptiste | Aston University |
| Khan, Mariya | Aston University |
| Viola, Ignazio Maria | University of Edinburgh |
| Nabae, Hiroyuki | Tokyo Institute of Technology |
| Suzumori, Koichi | Tokyo Institute of Technology |
| Stokes, Adam Andrew | University of Edinburgh |
| Giorgio-Serchi, Francesco | University of Edinburgh |
Keywords: Wearable Robotics, Soft Robot Materials and Design, Hydraulic/Pneumatic Actuators
Abstract: Active textiles have shown promising applications in soft robotics owing to their tunable stiffness and design flexibility. Given the breadth of the design space for planar and spatial arrangements of these woven structures, a rigorous and generalizable characterisation of these systems is not yet available. In order to characterize the response of a stereotypical woven pattern to actuation, we undertake a parametric study of plain weave active fabrics and characterise their mechanical properties in accordance with the relevant ISO standards for varying muscle densities and both monotonically increasing/decreasing pressures. Tensile and flexural tests were undertaken on five plain weave samples made of a nylon 6 (polyamide) warp and EM20 McKibben S-muscle weft, for input pressures ranging from 0.00 MPa to 0.60 MPa, at three muscle densities, namely 100 m^-1, 74.26 m^-1 and 47.62 m^-1. Contrary to intuition, we find that a lower muscle density has a more prominent impact on the thickness, but a significantly lesser one on length, highlighting a critical dependency on the relative orientation among the loading, the passive textile and the muscle filaments. Hysteretic behaviour as large as 10% of the longitudinal contraction is observed on individual filaments and woven textiles, and its onset is identified in the shear between the rubber tube and the outer sleeve of the artificial muscle. Hysteresis is shown to be muscle density-dependent and responsible for a strongly asymmetrical response upon different pressure inputs. These findings provide new insights into the mechanical properties of active textiles with tunable stiffness, and may contribute to future developments in wearable technologies and biomedical devices.
|
| |
| 10:00-11:30, Paper MoAIP-02.11 | Add to My Program |
| Adaptive Symmetry Reference Trajectory Generation in Shared Autonomy for Active Knee Orthosis |
|
| Liu, Rongkai | University of Science and Technology of China(USTC) |
| Ma, Tingting | Chinese Academy of Sciences |
| Yao, Ningguang | University of Science and Technology of China |
| Li, Hao | Chinese Academy of Sciences |
| Zhao, Xinyan | University of Science and Technology of China |
| Wang, Yu | University of Science and Technology of China |
| Pan, Hongqing | Hefei Institutes of Physical Science |
| Song, Quanjun | Chinese Academy of Science |
Keywords: Human-Centered Robotics, Rehabilitation Robotics, Human-Robot Collaboration
Abstract: Gait symmetry training plays an essential role in the rehabilitation of hemiplegic patients, and robotics-based gait training has been widely accepted by patients and clinicians. Reference trajectory generation for the affected side using the motion data of the unaffected side is an important way to achieve this. However, generating the reference gait trajectory online requires the algorithm to provide the correct gait phase delay while reducing the impact of measurement noise from sensors and input uncertainty from users. Based on an active knee orthosis (AKO) prototype, this work presents an adaptive symmetric gait trajectory generation framework for the gait rehabilitation of hemiplegic patients. Using adaptive nonlinear frequency oscillators (ANFO) and movement primitives, we implement online gait pattern encoding and adaptive phase delay according to real-time user input. A shared autonomy (SA) module with online input validation and arbitration has been designed to prevent undesired movements from being transmitted to the actuator on the affected side. The experimental results demonstrate the feasibility of the framework. Overall, this work suggests that the proposed method has the potential to perform gait symmetry rehabilitation in an unstructured environment and provide a kinematic reference for torque-assist AKO.
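The frequency-adaptation idea behind the ANFO can be illustrated with a minimal sketch. This is not the authors' implementation; it is the standard phase-form adaptive frequency oscillator, with the coupling gain `eps` and all signal parameters chosen purely for illustration:

```python
import math

def adaptive_frequency_oscillator(signal, dt, omega0, eps=2.0):
    """Phase-form adaptive frequency oscillator:
        phi'   = omega - eps * F(t) * sin(phi)
        omega' = -eps * F(t) * sin(phi)
    Forward-Euler integration; the frequency state omega drifts
    toward the dominant frequency of the periodic input F(t)."""
    phi, omega = 0.0, omega0
    for f in signal:
        coupling = eps * f * math.sin(phi)
        phi += (omega - coupling) * dt
        omega += -coupling * dt
    return omega

# 100 s of a 1 Hz input (true angular frequency 2*pi ~ 6.28 rad/s),
# starting from a deliberately wrong estimate of 5 rad/s
dt = 0.001
signal = [math.cos(2.0 * math.pi * 1.0 * i * dt) for i in range(100_000)]
omega_hat = adaptive_frequency_oscillator(signal, dt, omega0=5.0)
```

Once the oscillator phase-locks to the input, its frequency state tracks the gait frequency, which is what allows the correct phase delay to be applied to the affected side.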
|
| |
| 10:00-11:30, Paper MoAIP-02.12 | Add to My Program |
| Data-Driven Modeling for Gait Phase Recognition in a Wearable Exoskeleton Using Estimated Forces (I) |
|
| Park, Kyeong-Won | Republic of Korea Air Force Academy |
| Choi, Jungsu | Yeungnam University |
| Kong, Kyoungchul | Korea Advanced Institute of Science and Technology |
Keywords: Wearable Robots, AI-Based Methods, Human-Centered Robotics, Robust/Adaptive Control of Robotic Systems
Abstract: Accurate identification of gait phases is critical in effectively assessing the assistance provided by lower-limb exoskeletons. In this study, we propose a novel gait phase recognition system called ObsNet to analyze the gait of individuals with spinal cord injuries (SCI). To ensure the reliable use of exoskeletons, it is essential to maintain practicality and avoid exposing the system to unnecessary risks of fatigue, inaccuracy, or incompatibility with human-centered devices. Therefore, we propose a new approach to characterize exoskeletal-assisted gait by estimating forces on exoskeletal joints during walking. Although these estimated forces are potentially useful for detecting gait phases, their nonlinearities make it challenging for existing algorithms to generalize accurately. To address this challenge, we introduce a data-driven model that simultaneously captures both feature extraction and order dependencies, and enhance its performance through a threshold-based compensational method to filter out momentary errors. We evaluated the effectiveness of ObsNet through robotic walking experiments with two practical users with complete paraplegia. Our results indicate that ObsNet outperformed state-of-the-art methods that use joint information and other recurrent networks in identifying the gait phases of individuals with SCI (p < 0.05). We also observed reliable imitation of ground truth after compensation. Overall, our research highlights the potential of wearable technology to improve the daily lives of individuals with disabilities through accurate and stable state assessment.
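The threshold-based compensational method is described only at a high level; a generic version of such momentary-error filtering (the run length `min_run` and the overwrite-with-previous-phase rule are assumptions, not the paper's exact rule) might look like:

```python
def suppress_flickers(labels, min_run=5):
    """Overwrite any run of identical gait-phase labels shorter
    than `min_run` samples with the preceding phase, treating it
    as a momentary misclassification. A short run at the very
    start is kept, since there is no preceding phase to merge into."""
    out = list(labels)
    i, n = 0, len(out)
    while i < n:
        j = i
        while j < n and out[j] == out[i]:
            j += 1                       # [i, j) is one run
        if j - i < min_run and i > 0:
            out[i:j] = [out[i - 1]] * (j - i)
        i = j
    return out
```

For example, a 2-sample flicker of phase 1 inside a long phase-0 segment is absorbed into phase 0, while genuine phase transitions (runs of at least `min_run` samples) pass through unchanged.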
|
| |
| MoAIP-03 Regular session, Hall E |
Add to My Program |
| Collision Avoidance I |
|
| |
| |
| 10:00-11:30, Paper MoAIP-03.1 | Add to My Program |
| Dynamic Multi-Query Motion Planning with Differential Constraints and Moving Goals |
|
| Gentner, Michael | Technical University of Munich and BMW AG |
| Zillenbiller, Fabian | Technical University of Munich and BMW AG |
| Kraft, André | BMW AG, Germany |
| Steinbach, Eckehard | Technical University of Munich |
Keywords: Collision Avoidance, Motion and Path Planning, Industrial Robots
Abstract: Planning robot motions in complex environments is a fundamental research challenge and central to the autonomy, efficiency, and ultimately adoption of robots. While the environment is often assumed to be static, real-world settings, such as assembly lines, contain complex-shaped moving obstacles and changing target states. There, robots must perform safe and efficient motions to achieve their tasks. In repetitive environments and multi-goal settings, reusable roadmaps can substantially reduce the overall query time. Most dynamic roadmap-based planners operate in state-time space, which is computationally demanding. Interval-based methods store availabilities as node attributes and thereby circumvent the dimensionality increase. However, current approaches do not consider higher-order constraints, which can ultimately lead to collisions during execution, and they must replan when the goal changes. To this end, we propose a novel roadmap-based planner for systems with third-order differential constraints operating in dynamic environments with moving goals. We construct a roadmap with availabilities as node attributes. During the query phase, we use a Double-Integrator Minimum Time (DIMT) solver to recursively build feasible trajectories and accurately estimate arrival times. An exit node set in combination with a moving-goal heuristic is used to efficiently find the fastest path through the roadmap to the moving goal. We evaluate our method with a simulated UAV operating in dynamic 2D environments and show that it also transfers to a 6-DoF manipulator. We show higher success rates than other state-of-the-art methods both in collision avoidance and in reaching a moving goal.
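The DIMT solver referenced above computes minimum-time trajectories between arbitrary double-integrator states; a minimal sketch of the rest-to-rest special case (the general solver also handles nonzero boundary velocities, which is not shown here) is:

```python
import math

def min_time_rest_to_rest(d, v_max, a_max):
    """Minimum time for a 1D double integrator to travel a
    distance d between two rest states under |v| <= v_max and
    |a| <= a_max: bang-bang if v_max is never reached, otherwise
    a trapezoidal accelerate-cruise-decelerate profile."""
    d = abs(d)
    if d <= v_max * v_max / a_max:       # v_max not reached
        return 2.0 * math.sqrt(d / a_max)
    return d / v_max + v_max / a_max     # trapezoidal profile
```

Such closed-form arrival-time estimates are what make it cheap to evaluate many candidate roadmap edges during the query phase.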
|
| |
| 10:00-11:30, Paper MoAIP-03.2 | Add to My Program |
| Reactive and Safe Co-Navigation with Haptic Guidance |
|
| Coffey, Mela | Boston University |
| Zhang, Dawei | Boston University |
| Tron, Roberto | Boston University |
| Pierson, Alyssa | Boston University |
Keywords: Collision Avoidance, Telerobotics and Teleoperation, Human-Robot Collaboration
Abstract: We propose a co-navigation algorithm that enables a human and a robot to work together to navigate to a common goal. In this system, the human is responsible for making high-level steering decisions, and the robot, in turn, provides haptic feedback for collision avoidance and path suggestions while reacting to changes in the environment. Our algorithm uses optimized Rapidly-exploring Random Trees (RRT*) to generate paths to lead the user to the goal, via an attractive force feedback computed using a Control Lyapunov Function (CLF). We simultaneously ensure collision avoidance where necessary using a Control Barrier Function (CBF). We demonstrate our approach using simulations with a virtual pilot, and hardware experiments with a human pilot. Our results show that combining RRT* and CBFs is a promising tool for enabling collaborative human-robot navigation.
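The CBF side of such a scheme can be sketched for a single-integrator robot and one circular obstacle, where the safety-filtering QP has a closed-form solution. This is a toy stand-in for the paper's combined CLF/CBF approach; the dynamics, barrier function, and gain `alpha` are illustrative assumptions:

```python
def cbf_filter(x, u_nom, obs, r, alpha=1.0):
    """Minimally modify a nominal velocity u_nom so that the CBF
    condition  dh/dt >= -alpha * h  holds for a single-integrator
    robot and h(x) = ||x - obs||^2 - r^2 (obstacle of radius r).
    The QP  min ||u - u_nom||^2  s.t.  a.u + alpha*h >= 0, with
    a = grad h, has the closed-form projection below.
    Assumes x != obs (the gradient vanishes at the obstacle centre)."""
    ax, ay = 2.0 * (x[0] - obs[0]), 2.0 * (x[1] - obs[1])
    h = (x[0] - obs[0]) ** 2 + (x[1] - obs[1]) ** 2 - r * r
    slack = ax * u_nom[0] + ay * u_nom[1] + alpha * h
    if slack >= 0.0:
        return u_nom                     # nominal command already safe
    lam = -slack / (ax * ax + ay * ay)   # active-constraint multiplier
    return (u_nom[0] + lam * ax, u_nom[1] + lam * ay)
```

In the co-navigation setting, `u_nom` would come from the human's steering plus the CLF attraction toward the RRT* path, and the filtered difference can be rendered back to the user as haptic feedback.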
|
| |
| 10:00-11:30, Paper MoAIP-03.3 | Add to My Program |
| An MCTS-DRL Based Obstacle and Occlusion Avoidance Methodology in Robotic Follow-Ahead Applications |
|
| Leisiazar, Sahar | Simon Fraser University |
| Park, Edward J. | Simon Fraser University |
| Lim, Angelica | Simon Fraser University |
| Chen, Mo | Simon Fraser University |
Keywords: Robot Companions, Collision Avoidance, AI-Enabled Robotics
Abstract: We propose a novel methodology for robotic follow-ahead applications that addresses the critical challenge of obstacle and occlusion avoidance. Our approach effectively navigates the robot while avoiding collisions and occlusions caused by surrounding objects. To achieve this, we developed a high-level decision-making algorithm that generates short-term navigational goals for the mobile robot. Monte Carlo Tree Search is integrated with a Deep Reinforcement Learning method to enhance the decision-making process and generate more reliable navigational goals. Through extensive experimentation and analysis, we demonstrate the effectiveness and superiority of our proposed approach in comparison to existing follow-ahead human-following robotic methods. Our code is available at https://github.com/saharLeisiazar/follow-ahead-ros.
|
| |
| 10:00-11:30, Paper MoAIP-03.4 | Add to My Program |
| Proactive Model Predictive Control with Multi-Modal Human Motion Prediction in Cluttered Dynamic Environments |
|
| Heuer, Lukas | Örebro University, Robert Bosch GmbH |
| Palmieri, Luigi | Robert Bosch GmbH |
| Rudenko, Andrey | Robert Bosch GmbH |
| Mannucci, Anna | Robert Bosch GmbH Corporate Research |
| Magnusson, Martin | Örebro University |
| Arras, Kai Oliver | Bosch Research |
Keywords: Collision Avoidance, Human-Aware Motion Planning, Motion and Path Planning
Abstract: For robots navigating in dynamic environments, exploiting and understanding uncertain human motion prediction is key to generate efficient, safe and legible actions. The robot may perform poorly and cause hindrances if it does not reason over possible, multi-modal future social interactions. With the goal of further enhancing autonomous navigation in cluttered environments, we propose a novel formulation for nonlinear model predictive control including multi-modal predictions of human motion. As a result, our approach leads to less conservative, smooth and intuitive human-aware navigation with reduced risk of collisions, and shows a good balance between task efficiency, collision avoidance and human comfort. To show its effectiveness, we compare our approach against the state of the art in crowded simulated environments, and with real-world human motion data from the THOR dataset. This comparison shows that we are able to improve task efficiency, keep a larger distance to humans and significantly reduce the collision time, when navigating in cluttered dynamic environments. Furthermore, the method is shown to work robustly with different state-of-the-art human motion predictors.
|
| |
| 10:00-11:30, Paper MoAIP-03.5 | Add to My Program |
| A Novel Obstacle-Avoidance Solution with Non-Iterative Neural Controller for Joint-Constrained Redundant Manipulators |
|
| Li, Weibing | Sun Yat-Sen University |
| Yi, Zilian | Sun Yat-Sen University |
| Zou, Yanying | Sun Yat-Sen University |
| Wu, Haimei | Sun Yat-Sen University |
| Yang, Yang | Sun Yat-Sen University |
| Pan, Yongping | Sun Yat-Sen University |
Keywords: Collision Avoidance, Optimization and Optimal Control, Redundant Robots
Abstract: Obstacle avoidance (OA) and joint-limit avoidance (JLA) are essential for redundant manipulators to ensure safe and reliable robotic operations. One solution to OA and JLA is to incorporate the involved constraints into a quadratic program (QP), by solving which OA and JLA can be achieved. There exist a few non-iterative solvers, such as zeroing neural networks (ZNNs), that can solve each sampled QP problem in a single iteration, yet none is suitable for OA and JLA due to the absence of some derivative information. To tackle these issues, this paper proposes a novel solution with a non-iterative neural controller, termed NCP-ZNN, for joint-constrained redundant manipulators. Unlike iterative methods, the proposed neural controller involving derivative information possesses positive features including non-iterative computing and convergence over time. In this paper, the re-established OA-JLA scheme is first introduced. Then, the design details of the neural controller are presented. After that, comparative simulations based on a PA10 robot and an experiment based on a Franka Emika Panda robot are conducted, demonstrating that the proposed neural controller is more competent in OA and JLA.
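The ZNN idea of solving each sampled problem without inner iterations can be illustrated on a scalar time-varying equation; the matrix-valued QP dynamics used in the paper follow the same error-zeroing principle but are not reproduced here:

```python
import math

def znn_track(lmbda=10.0, dt=1e-3, T=5.0):
    """Zeroing neural network (ZNN) sketch for the scalar
    time-varying equation a(t)*x = b(t): impose e' = -lambda*e on
    the error e = a*x - b, giving the continuous (non-iterative)
    update  x' = (b' - a'*x - lambda*e) / a.  Analytic derivatives
    of the illustrative choices a(t) = 2 + sin t and b(t) = cos t
    are used; this is a toy stand-in for the matrix-valued case."""
    x, t = 0.0, 0.0
    for _ in range(int(T / dt)):
        a, b = 2.0 + math.sin(t), math.cos(t)
        da, db = math.cos(t), -math.sin(t)
        e = a * x - b
        x += dt * (db - da * x - lmbda * e) / a
        t += dt
    return x, math.cos(t) / (2.0 + math.sin(t))   # estimate, exact

x_hat, x_true = znn_track()
```

The derivative terms `da` and `db` are exactly the "derivative information" the abstract refers to: without them the error cannot be driven to zero for a time-varying problem.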
|
| |
| 10:00-11:30, Paper MoAIP-03.6 | Add to My Program |
| TTC4MCP: Monocular Collision Prediction Based on Self-Supervised TTC Estimation |
|
| Li, Changlin | Shanghai Jiao Tong University |
| Qian, Yeqiang | Shanghai Jiao Tong University |
| Sun, Cong | Shanghai Jiao Tong University |
| Yan, Weihao | Shanghai Jiao Tong University |
| Wang, Chunxiang | Shanghai Jiaotong University |
| Yang, Ming | Shanghai Jiao Tong University |
Keywords: Collision Avoidance, Computer Vision for Transportation, Deep Learning for Visual Perception
Abstract: Vision-based collision prediction for autonomous driving is a challenging task due to the dynamic movement of vehicles and diverse types of obstacles. Most existing methods rely on object detection algorithms, which only predict predefined collision targets, such as vehicles and pedestrians, and cannot anticipate emergencies caused by unknown obstacles. To address this limitation, we propose a novel approach using pixel-wise time-to-collision (TTC) estimation for monocular collision prediction (TTC4MCP). Our approach predicts TTC and optical flow from monocular images and identifies potential collision areas using feature clustering and motion analysis. To overcome the challenge of training TTC estimation models without ground truth data in new scenes, we propose a self-supervised TTC training method, enabling collision prediction in a wider range of scenarios. TTC4MCP is evaluated on multiple road conditions and demonstrates promising results in terms of accuracy and robustness.
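Pixel-wise TTC generalizes the classical looming cue, which can be sketched for a single tracked image region (the per-pixel, self-supervised estimation in the paper is far more involved):

```python
def ttc_from_scale(s_prev, s_cur, dt):
    """Time-to-collision from the looming cue: for an object
    approaching at constant speed, its image size satisfies
    s(t) = f*W / Z(t), so TTC = Z / v = s / (ds/dt) -- no depth
    or speed measurement is needed."""
    ds_dt = (s_cur - s_prev) / dt
    if ds_dt <= 0.0:
        return float('inf')              # not expanding: no collision cue
    return s_cur / ds_dt
```

Because the cue needs only image-plane expansion, it applies to unknown obstacle classes, which is the motivation the abstract gives for avoiding detector-based pipelines.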
|
| |
| 10:00-11:30, Paper MoAIP-03.7 | Add to My Program |
| DAMON: Dynamic Amorphous Obstacle Navigation Using Topological Manifold Learning and Variational Autoencoding |
|
| Dastider, Apan | University of Central Florida |
| Mingjie, Lin | University of Central Florida |
Keywords: Collision Avoidance, Deep Learning Methods, Motion and Path Planning
Abstract: DAMON leverages manifold learning and variational autoencoding to achieve obstacle avoidance, allowing for motion planning through adaptive graph traversal in a pre-learned low-dimensional hierarchically-structured manifold graph that captures intricate motion dynamics between a robotic arm and its obstacles. This versatile and reusable approach is applicable to various collaboration scenarios. The primary advantage of DAMON is its ability to embed information in a low-dimensional graph, eliminating the need for repeated computation required by current sampling-based methods. As a result, it offers faster and more efficient motion planning with significantly lower computational overhead and memory footprint. In summary, DAMON is a breakthrough methodology that addresses the challenge of dynamic obstacle avoidance in robotic systems and offers a promising solution for safe and efficient human-robot collaboration. Our approach has been experimentally validated on a 7-DoF robotic manipulator in both simulation and physical settings. DAMON enables the robot to learn and generate skills for avoiding previously-unseen obstacles while achieving predefined objectives. We also optimize DAMON's design parameters and performance using an analytical framework. Our approach outperforms mainstream methodologies, including RRT, RRT*, Dynamic RRT*, L2RRT, and MpNet, with 40% more trajectory smoothness and over 65% improved latency performance, on average.
|
| |
| 10:00-11:30, Paper MoAIP-03.8 | Add to My Program |
| Gatekeeper: Online Safety Verification and Control for Nonlinear Systems in Dynamic Environments |
|
| Agrawal, Devansh | University of Michigan |
| Chen, Ruichang | University of Michigan |
| Panagou, Dimitra | University of Michigan, Ann Arbor |
Keywords: Collision Avoidance, Motion and Path Planning
Abstract: This paper presents the gatekeeper algorithm, a real-time and computationally-lightweight method to ensure that nonlinear systems can operate safely in dynamic environments despite limited perception. Gatekeeper integrates with existing path planners and feedback controllers by introducing an additional verification step that ensures that proposed trajectories can be executed safely, despite nonlinear dynamics subject to bounded disturbances, input constraints and partial knowledge of the environment. Our key contribution is that (A) we propose an algorithm to recursively construct committed trajectories, and (B) we prove that tracking the committed trajectory ensures the system is safe for all time into the future. The method is demonstrated on a complicated firefighting mission in a dynamic environment, and compares against the state-of-the-art techniques for similar problems.
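The verification step can be sketched abstractly; the safety and terminal-invariance predicates below are placeholders for the paper's actual conditions on the disturbed nonlinear dynamics:

```python
def gatekeeper_step(committed, candidate, is_safe, ends_invariant):
    """Accept the candidate trajectory only if every state along
    it is safe with respect to the currently known environment AND
    it terminates in a set that remains safe for all future time;
    otherwise keep the previously committed trajectory."""
    if all(is_safe(s) for s in candidate) and ends_invariant(candidate):
        return candidate
    return committed
```

The key property is that the committed trajectory is only ever replaced by one that has itself been verified, so tracking it remains safe even when the planner's next proposal is rejected.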
|
| |
| 10:00-11:30, Paper MoAIP-03.9 | Add to My Program |
| Combinatorial Disjunctive Constraints for Obstacle Avoidance in Path Planning |
|
| Garcia, Raul | Rice University |
| Hicks, Illya V. | Rice University |
| Huchette, Joey | Google Research |
Keywords: Collision Avoidance, Motion and Path Planning, Optimization and Optimal Control
Abstract: We present a new approach for modeling avoidance constraints in 2D environments, in which waypoints are assigned to obstacle-free polyhedral regions. Constraints of this form are often formulated as mixed-integer programming (MIP) problems employing big-M techniques; however, these are generally not the strongest formulations possible with respect to the MIP's convex relaxation (so-called ideal formulations), potentially resulting in a larger computational burden. We instead model obstacle avoidance as combinatorial disjunctive constraints and leverage the independent branching scheme to construct small, ideal formulations. As our approach requires a biclique cover for an associated graph, we exploit the structure of this class of graphs to develop a fast subroutine for obtaining biclique covers in polynomial time. We also contribute an open-source Julia library named ClutteredEnvPathOpt to facilitate computational experiments of MIP formulations for obstacle avoidance. Experiments have shown our formulation is more compact and remains competitive on a number of instances compared with standard big-M techniques, for which solvers possess highly optimized procedures.
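The underlying disjunction, that each waypoint must lie in at least one obstacle-free polyhedral region {x : Ax <= b}, can be checked directly for a fixed point; the MIP formulations in the paper encode this disjunction over decision variables, whereas this sketch uses no solver:

```python
def in_region(point, A, b, tol=1e-9):
    """True if the point x satisfies A x <= b (one obstacle-free
    polyhedral region), checked row by row."""
    return all(sum(a_ij * x_j for a_ij, x_j in zip(row, point)) <= b_i + tol
               for row, b_i in zip(A, b))

def assign_region(point, regions):
    """Return the index of the first safe region containing the
    waypoint, or None if it lies in no region (i.e., in collision)."""
    for k, (A, b) in enumerate(regions):
        if in_region(point, A, b):
            return k
    return None

def box(lo, hi):
    """Axis-aligned box lo <= x <= hi, 0 <= y <= 1, as (A, b)."""
    return ([[1, 0], [-1, 0], [0, 1], [0, -1]], [hi, -lo, 1, 0])

# Two safe boxes separated by an obstacle strip 1 < x < 2
regions = [box(0, 1), box(2, 3)]
```

In the big-M and disjunctive formulations alike, the binary structure being optimized is exactly this region-membership assignment, one disjunction per waypoint.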
|
| |
| 10:00-11:30, Paper MoAIP-03.10 | Add to My Program |
| Reachability-Aware Collision Avoidance for Tractor-Trailer System with Non-Linear MPC and Control Barrier Function |
|
| Tang, Yucheng | University of Applied Sciences Karlsruhe |
| Mamaev, Ilshat | Karlsruhe Institute of Technology |
| Qin, Jing | Karlsruhe University of Applied Sciences |
| Wurll, Christian | Karlsruhe University of Applied Sciences |
| Hein, Björn | Karlsruhe University of Applied Sciences |
Keywords: Collision Avoidance, Optimization and Optimal Control, Nonholonomic Motion Planning
Abstract: This paper proposes a reachability-aware model predictive control with a discrete control barrier function for backward obstacle avoidance for a tractor-trailer system. The framework incorporates the state-variant reachable set obtained through sampling-based reachability analysis and symbolic regression into the objective function of model predictive control. By optimizing the intersection of the reachable set and iterative non-safe region generated by the control barrier function, the system demonstrates better performance in terms of safety with a constant decay rate, while enhancing the feasibility of the optimization problem. The proposed algorithm improves real-time performance due to a shorter horizon and outperforms the state-of-the-art algorithms in the simulation environment and on a real robot.
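The constant-decay-rate safety property mentioned above is captured by the standard discrete-time control barrier function condition, h(x_{k+1}) - h(x_k) >= -gamma * h(x_k), which keeps the safe set {h >= 0} forward invariant. The sketch below (my simplified stand-in, not the paper's tractor-trailer NMPC) checks this condition with h as the distance to a disc obstacle:

```python
# Discrete-time CBF decay check on a toy point robot near a disc obstacle.

def cbf_ok(h_next, h_curr, gamma=0.3):
    """Discrete CBF condition with constant decay rate gamma in (0, 1]."""
    return h_next - h_curr >= -gamma * h_curr

def h(x, obs=(0.0, 0.0), radius=1.0):
    """Barrier: signed distance of point x to a disc (safe when >= 0)."""
    dx, dy = x[0] - obs[0], x[1] - obs[1]
    return (dx * dx + dy * dy) ** 0.5 - radius

x0, x1 = (2.0, 0.0), (1.8, 0.0)  # one step toward the obstacle
```

In the paper this condition enters the MPC as a constraint on each predicted step; here it only filters a single proposed transition.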
|
| |
| 10:00-11:30, Paper MoAIP-03.11 | Add to My Program |
| Continuous Implicit SDF Based Any-Shape Robot Trajectory Optimization |
|
| Zhang, Tingrui | Zhejiang University |
| Wang, Jingping | Zhejiang University |
| Xu, Chao | Zhejiang University |
| Gao, Alan | Fan'gang |
| Gao, Fei | Zhejiang University |
Keywords: Collision Avoidance, Whole-Body Motion Planning and Control, Motion and Path Planning
Abstract: Optimization-based trajectory generation methods are widely used in whole-body planning for robots. However, existing work either oversimplifies the robot's geometry and environment representation, resulting in conservative trajectories, or suffers from a huge overhead in maintaining additional information such as the Signed Distance Field (SDF). To bridge the gap, we consider the robot as an implicit function, with its surface boundary represented by the zero-level set of its SDF. We further employ another implicit function to lazily compute the signed distance to the swept volume generated by the robot and its trajectory. The computation is efficient by exploiting continuity in space-time, and the implicit function guarantees precise and continuous collision evaluation even for nonconvex robots with complex surfaces. We also propose a trajectory optimization pipeline applicable to the implicit SDF. Simulation and real-world experiments validate the high performance of our approach for arbitrarily shaped robot trajectory optimization. The code will be released at https://github.com/ZJU-FAST-Lab/Implicit-SDF-Planner.
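The swept-volume idea can be made concrete with a simplified sketch (mine, not the authors' implementation): the signed distance of a world point to the swept volume is the minimum over trajectory time of the robot-frame SDF. A disc robot translating along a line keeps everything closed-form; the paper's contribution is doing this efficiently and differentiably for arbitrary shapes.

```python
import math

def robot_sdf(p_body, radius=0.5):
    """SDF of a disc robot in its own body frame; negative inside."""
    return math.hypot(p_body[0], p_body[1]) - radius

def swept_sdf(p_world, traj, radius=0.5):
    """min over time of the SDF of the point mapped into the robot frame."""
    return min(robot_sdf((p_world[0] - c[0], p_world[1] - c[1]), radius)
               for c in traj)

traj = [(t * 0.1, 0.0) for t in range(11)]  # straight sweep, x from 0 to 1
```

The real method avoids this brute-force minimum over sampled times by exploiting space-time continuity.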
|
| |
| 10:00-11:30, Paper MoAIP-03.12 | Add to My Program |
| Robo-Centric ESDF: A Fast and Accurate Whole-Body Collision Evaluation Tool for Any-Shape Robotic Planning |
|
| Geng, Shuang | Zhejiang University |
| Wang, Qianhao | Zhejiang University |
| Xie, Lei | State Key Laboratory of Industrial Control Technology, Zhejiang |
| Xu, Chao | Zhejiang University |
| Cao, Yanjun | Zhejiang University, Huzhou Institute of Zhejiang University |
| Gao, Fei | Zhejiang University |
Keywords: Collision Avoidance, Motion and Path Planning
Abstract: To let mobile robots travel flexibly through complicated environments, increasing attention has been paid to whole-body collision evaluation. Most existing works either opt for conservative corridor-based methods that impose strict requirements on the corridor generation, or ESDF-based methods that suffer from high computational overhead. It remains a great challenge to achieve fast and accurate whole-body collision evaluation. In this paper, we propose a Robo-centric ESDF (RC-ESDF) that is pre-built in the robot body frame and can be seamlessly applied to any-shape mobile robots, even those with non-convex shapes. RC-ESDF enjoys lazy collision evaluation, which retains only the minimum information sufficient for the whole-body safety constraint and significantly speeds up trajectory optimization. Based on the analytical gradients provided by RC-ESDF, we jointly optimize the position and rotation of the robot, with whole-body safety, smoothness, and dynamical feasibility taken into account. Extensive simulation and real-world experiments verified the reliability and generalizability of our method.
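The robo-centric trick can be sketched as follows (a simplified illustration of my own, not the authors' code): instead of rebuilding a world-frame ESDF as the robot moves, pre-build a distance field once in the body frame and evaluate obstacle points by transforming them into that frame. Here a closed-form square-body SDF stands in for the precomputed grid.

```python
import math

def world_to_body(p_world, pos, yaw):
    """Rigid transform of a world point into the robot body frame."""
    dx, dy = p_world[0] - pos[0], p_world[1] - pos[1]
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * dx + s * dy, -s * dx + c * dy)

def body_esdf(p_body, half=0.5):
    """Closed-form signed distance to a square body (side 2*half)."""
    qx, qy = abs(p_body[0]) - half, abs(p_body[1]) - half
    outside = math.hypot(max(qx, 0.0), max(qy, 0.0))
    return outside + min(max(qx, qy), 0.0)

def clearance(obstacle_pt, robot_pos, robot_yaw):
    """Signed distance from the robot surface to an obstacle point."""
    return body_esdf(world_to_body(obstacle_pt, robot_pos, robot_yaw))
```

Because the field lives in the body frame, rotating the robot only changes the transform, not the field, which is what makes joint position-rotation optimization cheap.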
|
| |
| 10:00-11:30, Paper MoAIP-03.13 | Add to My Program |
| Global Map Assisted Multi-Agent Collision Avoidance Via Deep Reinforcement Learning Around Complex Obstacles |
|
| Du, Yuanyuan | CUHK, Shenzhen |
| Zhang, Jianan | Peking University |
| Xu, Jie | CUHK, Shenzhen |
| Cheng, Xiang | Peking University |
| Cui, Shuguang | CUHK, Shenzhen |
Keywords: Collision Avoidance, Motion and Path Planning, Reinforcement Learning
Abstract: State-of-the-art multi-agent collision avoidance algorithms face limitations when applied to cluttered public environments, where obstacles may have a variety of shapes and structures. The issue arises because most of these algorithms are agent-level methods: they concentrate solely on preventing collisions between the agents, while obstacles are handled only outside the policy. Obstacle-aware policies instead output an action considering both agents and obstacles. Current obstacle-aware algorithms, mainly based on Lidar sensor data, struggle to handle collision avoidance around complex obstacles. To resolve this issue, this paper investigates how to find a better way to travel around diverse obstacles. In particular, we present a global map assisted collision avoidance algorithm which, following the lead of a high-level goal guide and using an obstacle representation called a distance map, considers other agents and obstacles simultaneously. Moreover, our model can be loaded into each agent individually, making it applicable to large maps or more agents. Simulation results indicate that our model outperforms state-of-the-art algorithms, particularly in scenarios with complex obstacles. We also present an approach for incorporating global information in decentralized decision-making, along with a method for extending agent-level algorithms to adapt to cluttered environments in real-world scenarios.
|
| |
| MoAIP-04 Regular session, Hall E |
Add to My Program |
| Control Applications |
|
| |
| |
| 10:00-11:30, Paper MoAIP-04.1 | Add to My Program |
| A Geometric Sufficient Condition for Contact Wrench Feasibility |
|
| Li, Shenggao | University of Notre Dame |
| Chen, Hua | Southern University of Science and Technology |
| Zhang, Wei | Southern University of Science and Technology |
| Wensing, Patrick M. | University of Notre Dame |
Keywords: Body Balancing, Humanoid and Bipedal Locomotion, Whole-Body Motion Planning and Control
Abstract: A fundamental problem in legged locomotion is to verify whether a desired trajectory satisfies all physical constraints, especially those for maintaining the contacts. Although foot tipping can be avoided via the Zero Moment Point (ZMP) condition, preventing foot sliding and twisting leads to the more complex Contact Wrench Cone (CWC) constraints. This paper proposes an efficient algorithm to certify the inclusion of a net contact wrench in the CWC on flat ground with uniform friction. In addition to checking the ZMP criteria, the proposed method also verifies whether the linear force and the yaw moment are feasible. The key step in the algorithm is a new exact geometric characterization of the yaw moment limits in the case when the support polygon is approximated by a single supporting line. We propose two approaches to select this approximating line, providing an accurate inner approximation of the ground truth yaw moment limits with only 18.80% (resp. 7.13%) error. The methods require only 1/150 (resp. 1/139) computation time compared to the exact CWC method based on conic programming. As a benchmark, approximating the CWC using square friction pyramids requires similar computation times as the exact CWC, but has > 19.35% error. Unlike the ZMP condition, our method provides a sufficient condition for contact wrench feasibility.
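The baseline ZMP criterion that this paper extends is simple to state: on flat ground the zero moment point must lie inside the convex support polygon. A minimal sketch (mine; the paper's contribution is the additional linear-force and yaw-moment checks) tests inside-ness by requiring the point to be on the left of every counter-clockwise edge:

```python
# Point-in-convex-polygon test for the ZMP condition (CCW vertex order).

def zmp_inside(zmp, polygon):
    """True if the 2D point zmp lies inside the convex CCW polygon."""
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        cross = (bx - ax) * (zmp[1] - ay) - (by - ay) * (zmp[0] - ax)
        if cross < 0.0:  # right of an edge -> outside
            return False
    return True

foot = [(0.0, 0.0), (0.2, 0.0), (0.2, 0.1), (0.0, 0.1)]  # CCW footprint, meters
```

As the abstract notes, passing this test alone does not certify the full contact wrench cone; it only rules out tipping.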
|
| |
| 10:00-11:30, Paper MoAIP-04.2 | Add to My Program |
| Aggregating Single-Wheeled Mobile Robots for Omnidirectional Movements |
|
| Wang, Meng | Beijing Institute for General Artificial Intelligence |
| Su, Yao | Beijing Institute for General Artificial Intelligence |
| Li, Hang | Beijing Institute for General Artificial Intelligence |
| Li, Jiarui | Peking University |
| Liang, Jixaing | Beihang University |
| Liu, Hangxin | Beijing Institute for General Artificial Intelligence (BIGAI) |
Keywords: Education Robotics, Art and Entertainment Robotics
Abstract: This paper presents a novel modular robot system that can self-reconfigure to achieve omnidirectional movements for collaborative object transportation. Each robotic module is equipped with a steerable omni-wheel for navigation and is shaped as a regular icositetragon with a permanent magnet installed on each corner for stable docking. After aggregating multiple modules and forming a structure that can cage a target object, we have developed an optimization-based method to compute the distribution of all wheels' heading directions, which enables efficient omnidirectional movements of the structure. By implementing a hierarchical controller on our prototyped system in both simulation and experiment, we validated the trajectory-tracking performance of an individual module and a team of six modules in multiple navigation and collaborative object transportation settings. The results demonstrate that the proposed system can maintain a stable caging formation and achieve smooth transportation, indicating the effectiveness of our hardware and locomotion designs.
|
| |
| 10:00-11:30, Paper MoAIP-04.3 | Add to My Program |
| An On-Wall-Rotating Strategy for Effective Upstream Motion of Untethered Millirobot: Principle, Design and Demonstration (I) |
|
| Yang, Liu | City University of Hong Kong |
| Zhang, Tieshan | City University of Hong Kong |
| Huang, Han | City University of Hong Kong |
| Ren, Hao | City University of Hongkong |
| Shang, Wanfeng | Shenzhen Institutes of Advanced Technology, Chinese Academy of S |
| Shen, Yajing | The Hong Kong University of Science and Technology |
Keywords: on-wall-rotating, Medical Robots and Systems, Modeling, Control, and Learning for Soft Robots, Micro/Nano Robots
Abstract: Untethered miniature robots that can access narrow and harsh environments in the body show great potential for future biomedical applications. Although many types of millirobots have been developed, swimming against fast blood flow remains a big challenge due to the robot's limited ability to hold its position and the large hydraulic resistance from blood. This work proposes an on-wall-rotating strategy and a streamlined millirobot to achieve effective upstream motion in the lumen. First, the principle of the on-wall-rotating strategy and the dynamic motion model of the millirobot are established. Then, a critical safety angle θs is theoretically and experimentally analyzed for the safe and stable control of the robot. After that, a series of experiments is conducted to verify the proposed driving strategy. The results suggest that the robot is able to move at a speed of 5 mm/s against a flow velocity of 138 mm/s, which is comparable to a blood flow of 2700 mm³/s and several times faster than other reported driving strategies. This work offers a new strategy for the construction and control of untethered magnetic robots in blood vessels, which would promote the application of millirobots in biomedical engineering.
|
| |
| 10:00-11:30, Paper MoAIP-04.4 | Add to My Program |
| Smooth Stride Length Change of Rat Robot with a Compliant Actuated Spine Based on CPG Controller |
|
| Huang, Yuhong | Technische Universität München |
| Bing, Zhenshan | Technical University of Munich |
| Zhang, Zitao | Sun Yat-Sen University |
| Huang, Kai | Sun Yat-Sen University |
| Morin, Fabrice O. | Technische Universität München |
| Knoll, Alois | Tech. Univ. Muenchen TUM |
Keywords: Robust/Adaptive Control, Motion Control, Biologically-Inspired Robots
Abstract: The aim of this research is to investigate the relationship between spinal flexion and quadruped locomotion in a rat robot equipped with a compliant spine, controlled by a central pattern generator (CPG). The study reveals that spinal flexion can enhance limb stride length, but it may also cause significant and unexpected motion disturbances during stride length variations. To address this issue, this paper proposes a CPG model driven by spinal flexion and a novel oscillator that incorporates a circular limit cycle and accounts for the anticipated stride length transition process. This approach effectively matches the torque change with the dynamics of stride length changes, leading to lower energy consumption. Extensive simulations are conducted to evaluate the efficacy of the proposed oscillator and compare it with the original kinetic model and other CPG models. The results demonstrate that the designed CPG model with the proposed oscillator yields smoother gait transitions during stride length variations and reduces energy consumption.
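The "circular limit cycle" property that the proposed oscillator builds on is exhibited by the standard Hopf oscillator (shown below as a generic stand-in, not the paper's oscillator): its trajectories converge to a circle of radius sqrt(mu) regardless of the initial state, which makes amplitude transitions smooth.

```python
import math

# Euler integration of a Hopf oscillator; limit cycle radius = sqrt(mu).

def hopf_step(x, y, mu=1.0, omega=2.0 * math.pi, alpha=10.0, dt=1e-3):
    """One Euler step of the Hopf oscillator dynamics."""
    r2 = x * x + y * y
    dx = alpha * (mu - r2) * x - omega * y
    dy = alpha * (mu - r2) * y + omega * x
    return x + dt * dx, y + dt * dy

# Integrate from an off-cycle state; the radius converges toward 1.
x, y = 0.1, 0.0
for _ in range(5000):
    x, y = hopf_step(x, y)
radius = math.hypot(x, y)
```

In a CPG, mu would be driven by the desired stride length, and the contribution of the paper is shaping the transition between limit cycles, which this plain Hopf form does not address.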
|
| |
| 10:00-11:30, Paper MoAIP-04.5 | Add to My Program |
| Learning Terrain-Adaptive Locomotion with Agile Behaviors by Imitating Animals |
|
| Li, Tingguang | The Chinese University of Hong Kong |
| Zhang, Yizheng | Tencent |
| Zhang, Chong | Tencent |
| Zhu, Qingxu | Tencent |
| Sheng, Jiapeng | Shandong University |
| Chi, Wanchao | Tencent |
| Zhou, Cheng | Tencent |
| Han, Lei | Tencent Robotics X |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, AI-Based Methods
Abstract: In this paper, we present a general learning framework for controlling a quadruped robot that can mimic the behavior of real animals and traverse challenging terrains. Our method consists of two steps: an imitation learning step to learn from motions of real animals, and a terrain adaptation step to enable generalization to unseen terrains. We capture motions from a Labrador on various terrains to facilitate terrain-adaptive locomotion. Our experiments demonstrate that our policy can traverse various terrains and produce natural-looking behavior. We deployed our method on the real quadruped robot Max via zero-shot simulation-to-reality transfer, achieving a speed of 1.1 m/s while climbing stairs.
|
| |
| 10:00-11:30, Paper MoAIP-04.6 | Add to My Program |
| A Stable Adaptive Extended Kalman Filter for Estimating Robot Manipulators Link Velocity and Acceleration |
|
| Baradaran Birjandi, Seyed Ali | Technical University of Munich |
| Khurana, Harshit | EPFL |
| Billard, Aude | EPFL |
| Haddadin, Sami | Technical University of Munich |
Keywords: Sensor Fusion, Kinematics
Abstract: One can estimate the velocity and acceleration of robot manipulators by utilizing nonlinear observers. This involves combining inertial measurement units (IMUs) with the motor encoders of the robot through a model-based sensor fusion technique. This approach is lightweight, versatile (suitable for a wide range of trajectories and applications), and straightforward to implement. In this paper, we propose adapting the noise information online to further improve the estimation accuracy while the system is running. This automatically reduces the system's vulnerability to imperfect modeling and sensor changes. Moreover, viable strategies to maintain system stability are introduced. Finally, we thoroughly evaluate the overall framework with a seven-DoF robot manipulator whose links are equipped with IMUs.
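The general idea of adapting noise information can be illustrated on a scalar Kalman filter (my illustration of innovation-based adaptation, not the authors' scheme): the measurement noise R is re-estimated from a forgetting-factor average of squared innovations, so a sensor whose noise grows at runtime is automatically de-weighted.

```python
# Scalar random-walk Kalman filter with innovation-based R adaptation.

def kf_step(x, P, z, Q, R):
    """One predict/update step; returns state, covariance, innovation."""
    P = P + Q                  # predict (identity dynamics)
    nu = z - x                 # innovation (pre-update residual)
    S = P + R                  # innovation covariance
    K = P / S                  # Kalman gain
    x = x + K * nu             # update
    P = (1.0 - K) * P
    return x, P, nu

def adapt_R(R, innovation, P, forget=0.95):
    """Move R toward E[nu^2] - P, clamped to stay positive."""
    return max(1e-6, forget * R + (1 - forget) * (innovation ** 2 - P))
```

The full method in the paper additionally has to keep the nonlinear observer stable while the noise matrices change, which this toy omits.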
|
| |
| 10:00-11:30, Paper MoAIP-04.7 | Add to My Program |
| Provably Correct Sensor-Driven Path-Following for Unicycles Using Monotonic Score Functions |
|
| Clark, Benton | University of Kentucky |
| Hariprasad, Varun | Paul Laurence Dunbar High School |
| Poonawala, Hasan A. | University of Kentucky |
Keywords: Sensor-based Control, Autonomous Vehicle Navigation, Machine Learning for Robot Control
Abstract: This paper develops a provably stable sensor-driven controller for path-following applications of robots with unicycle kinematics, one specific class of which is the wheeled mobile robot (WMR). The sensor measurement is converted to a scalar value (the score) through some mapping (the score function); the latter may be designed or learned. The score is then mapped to forward and angular velocities using a simple rule with three parameters. The key contribution is that the correctness of this controller relies only on the score function satisfying monotonicity conditions with respect to the underlying state (local path coordinates), instead of achieving specific values at all states. The monotonicity conditions may be checked online by moving the WMR, without state estimation, or offline using a generative model of measurements such as a simulator. Our approach provides both the practicality of purely measurement-based control and the correctness of state-based guarantees. We demonstrate the effectiveness of this path-following approach on both a simulated and a physical WMR that use a learned score function derived from a binary classifier trained on real depth images.
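A hypothetical instance of such a score-to-velocity rule (the exact three-parameter rule is in the paper; this one is my stand-in, with parameters `v0`, `k`, `w_max`) keeps the forward speed constant and steers proportionally to the score's deviation from a reference, saturated at a maximum turn rate. Correctness would then hinge only on the score being monotonic in the cross-track coordinate.

```python
# Hypothetical score-to-command rule for a unicycle (illustrative only).

def unicycle_cmd(score, v0=0.3, k=1.5, w_max=1.0, score_ref=0.5):
    """Map a scalar score to (forward, angular) velocity commands."""
    w = max(-w_max, min(w_max, k * (score - score_ref)))
    return v0, w
```

With a monotonic score, a score above the reference consistently means the robot is off to one side, so proportional steering drives it back toward the path.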
|
| |
| 10:00-11:30, Paper MoAIP-04.8 | Add to My Program |
| Contact Reduction with Bounded Stiffness for Robust Sim-To-Real Transfer of Robot Assembly |
|
| Nghia, Vuong | Nanyang Technological University |
| Pham, Quang-Cuong | NTU Singapore |
Keywords: Simulation and Animation, Reinforcement Learning, Machine Learning for Robot Control
Abstract: In sim-to-real Reinforcement Learning (RL), a policy is trained in a simulated environment and then deployed on the physical system. The main challenge of sim-to-real RL is to overcome the reality gap, i.e. the discrepancies between the real world and its simulated counterpart. Using generic geometric representations, such as convex decompositions, triangular meshes, and signed distance fields, can improve simulation fidelity and thus potentially narrow the reality gap. Common to these approaches is that many contact points are generated for geometrically complex objects, which slows down simulation and may cause numerical instability. Contact reduction methods address these issues by limiting the number of contact points, but the validity of these methods for sim-to-real RL has not been confirmed. In this paper, we present a contact reduction method with bounded stiffness to improve simulation accuracy. Our experiments show that the proposed method critically enables training an RL policy for a tight-clearance double pin insertion task and successfully deploying the policy on a rigid, position-controlled physical robot.
|
| |
| 10:00-11:30, Paper MoAIP-04.9 | Add to My Program |
| Trajectory Tracking Via Multiscale Continuous Attractor Networks |
|
| Joseph, Therese | Queensland University of Technology |
| Fischer, Tobias | Queensland University of Technology |
| Milford, Michael J | Queensland University of Technology |
Keywords: Neurorobotics, Cognitive Modeling
Abstract: Animals and insects showcase remarkably robust and adept navigational abilities, up to literally circumnavigating the globe. Primary progress in robotics inspired by these natural systems has occurred in two areas: highly theoretical computational neuroscience models, and handcrafted systems like RatSLAM and NeuroSLAM. In this research, we present work bridging the gap between the two, in the form of Multiscale Continuous Attractor Networks (MCAN), which combine the multiscale parallel spatial neural networks of the previous theoretical models with the real-world robustness of the robot-targeted systems, to enable trajectory tracking over large velocity ranges. To overcome the previous systems' reliance on hand-tuned parameters, we present a genetic algorithm-based approach for automated tuning of these networks, substantially improving their usability. To provide challenging navigational scale ranges, we open-source a flexible city-scale navigation simulator that adapts to any street network, enabling high-throughput experimentation. In extensive experiments using the city-scale navigation environment and KITTI, we show that the system is capable of stable dead reckoning over a wide range of velocities and environmental scales, where a single-scale approach fails.
|
| |
| 10:00-11:30, Paper MoAIP-04.10 | Add to My Program |
| Design and Control of a Ballbot Drivetrain with High Agility, Minimal Footprint, and High Payload |
|
| Xiao, Chenzhang | University of Illinois at Urbana-Champaign |
| Mansouri, Mahshid | University of Illinois at Urbana-Champaign |
| Lam, David | University of Michigan - Ann Arbor |
| Ramos, Joao | University of Illinois at Urbana-Champaign |
| Hsiao-Wecksler, Elizabeth T. | University of Illinois at Urbana-Champaign |
Keywords: Body Balancing, Wheeled Robots, Underactuated Robots
Abstract: This paper presents the design and control of a ballbot drivetrain that aims to achieve high agility, minimal footprint, and high payload capacity while maintaining dynamic stability. Two hardware platforms and analytical models were developed to test design and control methodologies. The full-scale ballbot prototype (MiaPURE) was constructed using off-the-shelf components and designed to have agility, footprint, and balance similar to that of a walking human. The planar inverted pendulum testbed (PIPTB) was developed as a reduced-order testbed for quick validation of system performance. We then proposed a simple yet robust cascaded LQR-PI controller to balance and maneuver the ballbot drivetrain with a heavy payload. This is crucial because the drivetrain is often subject to high stiction due to elastomeric components in the torque transmission system. This controller was first tested in the PIPTB to compare with traditional LQR and cascaded PI-PD controllers, and then implemented in the ballbot drivetrain. The MiaPURE drivetrain was able to carry a payload of 60 kg, achieve a maximum speed of 2.3 m/s, and come to a stop from a speed of 1.4 m/s in 2 seconds in a selected translation direction. Finally, we demonstrated the omnidirectional movement of the ballbot drivetrain in an indoor environment as a payload-carrying robot and a human-riding mobility device. Our experiments demonstrated the feasibility of using the ballbot drivetrain as a universal mobility platform with agile movements, minimal footprint, and high payload capacity using our proposed design and control methodologies.
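The cascaded structure can be sketched as follows (simplified, with made-up gains rather than the MiaPURE values): an outer PI loop on translational velocity produces a lean-angle setpoint, and an inner LQR-style state-feedback loop on the lean angle and lean rate produces the actuator torque. Stiction compensation and the full ballbot dynamics are omitted.

```python
# Toy cascaded LQR-PI controller for a balancing drivetrain.

class CascadedLqrPi:
    def __init__(self, kp=0.8, ki=0.2, k_lqr=(12.0, 2.0), lean_max=0.15):
        self.kp, self.ki = kp, ki        # outer PI gains (velocity loop)
        self.k_lqr = k_lqr               # inner state-feedback gains
        self.lean_max = lean_max         # lean setpoint saturation, rad
        self.integ = 0.0

    def torque(self, v_err, lean, lean_rate, dt=0.01):
        # Outer PI: velocity error -> desired lean angle (saturated).
        self.integ += v_err * dt
        lean_des = self.kp * v_err + self.ki * self.integ
        lean_des = max(-self.lean_max, min(self.lean_max, lean_des))
        # Inner LQR-style feedback on (lean angle, lean rate).
        k1, k2 = self.k_lqr
        return k1 * (lean_des - lean) - k2 * lean_rate
```

Saturating the lean setpoint is what keeps the outer loop from commanding a fall; the paper's comparison is between this cascade and plain LQR or PI-PD alternatives.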
|
| |
| 10:00-11:30, Paper MoAIP-04.11 | Add to My Program |
| A Bayesian Reinforcement Learning Method for Periodic Robotic Control under Significant Uncertainty |
|
| Jia, Yuanyuan | Ritsumeikan University |
| Uriguen Eljuri, Pedro Miguel | Ritsumeikan University |
| Taniguchi, Tadahiro | Ritsumeikan University |
Keywords: Dexterous Manipulation, Medical Robots and Systems, Reinforcement Learning
Abstract: This paper addresses the lack of research on periodic reinforcement learning for physical robot control by presenting a 3-phase periodic Bayesian reinforcement learning method for uncertain environments. Drawing on cognition theory, the proposed approach achieves effective convergence with fewer training episodes. The coach-based demonstration phase narrows the search space and establishes a foundation for a coarse-to-fine control strategy. The reconnaissance phase enhances adaptability by discovering a valuable global representation, and the operation phase produces accurate robotic control by applying the learned representation and periodically updating local information. Comparative analysis with state-of-the-art methods validates the efficacy of our approach on exemplar control tasks in simulation and a biomedical project involving a simulated cranial window task.
|
| |
| 10:00-11:30, Paper MoAIP-04.12 | Add to My Program |
| Residual Physics Learning and System Identification for Sim-To-Real Transfer of Policies on Buoyancy Assisted Legged Robots |
|
| Sontakke, Nitish Rajnish | Georgia Institute of Technology |
| Chae, Hosik | University of California at Los Angeles |
| Lee, Sangjoon | University of California, Los Angeles |
| Huang, Tianle | Georgia Institute of Technology |
| Hong, Dennis | UCLA |
| Ha, Sehoon | Georgia Institute of Technology |
Keywords: Model Learning for Control, Reinforcement Learning, Legged Robots
Abstract: The light and soft characteristics of Buoyancy Assisted Lightweight Legged Unit (BALLU) robots have a great potential to provide intrinsically safe interactions in environments involving humans, unlike many heavy and rigid robots. However, their unique and sensitive dynamics impose challenges to obtaining robust control policies in the real world. In this work, we demonstrate robust sim-to-real transfer of control policies on the BALLU robots via system identification and our novel residual physics learning method, Environment Mimic (EnvMimic). First, we model the nonlinear dynamics of the actuators by collecting hardware data and optimizing the simulation parameters. Rather than relying on standard supervised learning formulations, we utilize deep reinforcement learning to train an external force policy to match real-world trajectories, which enables us to model residual physics with greater fidelity. We analyze the improved simulation fidelity by comparing the simulation trajectories against the real-world ones. We finally demonstrate that the improved simulator allows us to learn better walking and turning policies that can be successfully deployed on the hardware of BALLU.
|
| |
| 10:00-11:30, Paper MoAIP-04.13 | Add to My Program |
| DiffClothAI: Differentiable Cloth Simulation with Intersection-Free Frictional Contact and Differentiable Two-Way Coupling with Articulated Rigid Bodies |
|
| Yu, Xinyuan | National University of Singapore |
| Zhao, Siheng | Nanjing University |
| Luo, Siyuan | Xi'an Jiaotong University |
| Yang, Gang | National University of Singapore |
| Shao, Lin | National University of Singapore |
Keywords: Simulation and Animation, Optimization and Optimal Control
Abstract: Differentiable simulations have recently proven useful for various robotic manipulation tasks, including cloth manipulation. In robotic cloth simulation, it is crucial to maintain intersection-free properties. We present DiffClothAI, a differentiable cloth simulation with intersection-free frictional contact and two-way coupling with articulated rigid bodies. DiffClothAI coherently integrates Projective Dynamics and Incremental Potential Contact and proposes an effective method to derive gradients in the cloth simulation. It also establishes a differentiable coupling mechanism between articulated rigid bodies and cloth. We conduct a comprehensive evaluation of DiffClothAI's effectiveness and accuracy and perform a variety of experiments in downstream robotic manipulation tasks. Supplemental materials and videos are available on our project webpage.
|
| |
| 10:00-11:30, Paper MoAIP-04.14 | Add to My Program |
| Walk-Burrow-Tug: Legged Anchoring Analysis Using RFT-Based Granular Limit Surfaces |
|
| Huh, Tae Myung | UC Berkeley |
| Cao, Cyndia | University of California Berkeley |
| Aderibigbe, Jadesola | University of California, Berkeley |
| Moon, Deaho | Korea Institute of Science and Technology |
| Stuart, Hannah | UC Berkeley |
Keywords: Contact Modeling, Legged Robots, Mobile Manipulation
Abstract: We develop a new resistive force theory based granular limit surface (RFT-GLS) method to predict and guide behaviors of forceful ground robots. As a case study, we harness a small mobile robotic system, MiniRQuad (296 g), to "walk-burrow-tug"; it actively exploits ground anchoring by burrowing its legs to tug loads. RFT-GLS informs the selection of efficient strategies to transport sleds with varying masses. The granular limit surface (GLS), a wrench boundary that separates stationary and kinetic behavior, is computed using 3D resistive force theory (RFT) for a given body and set of motion twists. This limit surface is then used to predict the quasi-static trajectory of the robot when it fails to withstand an external load. We find that RFT-GLS enables accurate force and motion predictions in laboratory tests. For control applications, a pre-composed state space map of the twist-wrench pairs enables computationally efficient simulations to improve robotic anchoring strategies.
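The role of a limit surface can be shown with a toy membership test (my 2-D stand-in; the paper computes the surface from 3D RFT, not an ellipse): wrenches inside the surface are resisted by the anchored robot, wrenches outside cause it to yield and move.

```python
# Toy elliptical limit surface in (tangential force, yaw moment) space.

def resists(force, moment, f_max=10.0, m_max=2.0):
    """True if the wrench lies inside the elliptical limit surface."""
    return (force / f_max) ** 2 + (moment / m_max) ** 2 <= 1.0
```

The coupling between force and moment capacity (pulling harder shrinks the resistible yaw moment, and vice versa) is the qualitative behavior the RFT-computed surface captures quantitatively.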
|
| |
| MoAIP-05 Regular session, Hall E |
Add to My Program |
| Mechanism Design I |
|
| |
| |
| 10:00-11:30, Paper MoAIP-05.1 | Add to My Program |
| Tube Mechanism with 3-Axis Rotary Joints Structure to Achieve Variable Stiffness Using Positive Pressure |
|
| Onda, Issei | Tohoku University |
| Watanabe, Masahiro | Tohoku University |
| Tadakuma, Kenjiro | Tohoku University |
| Abe, Kazuki | Tohoku University |
| Tadokoro, Satoshi | Tohoku University |
Keywords: Mechanism Design, Hydraulic/Pneumatic Actuators, Flexible Robotics
Abstract: Studies on soft robotics have explored mechanisms for switching the stiffness of a robot structure. The hybrid soft-rigid approach, which combines soft materials and high-rigidity structures, is commonly used to achieve variable stiffness mechanisms. In particular, the positive-pressurization method has attracted significant attention in recent years, as it can eliminate the constraints on driving pressure. Moreover, it can change the shape-holding force according to internal pressure. In this study, a variable stiffness mechanism, comprising 3-axis rotary ball joints and a single chamber, was devised via frictional force using positive pressure. The prototype can change joint angles arbitrarily when no pressure is applied and can hold joint angles when positive pressure is applied. Using a theoretical model of the torque required to hold the joint angle, we simulated the holding torque using finite element method (FEM) analysis and measured the holding torque in the pitch and roll directions when internal pressure was applied. Comparing the theoretical model, measurements, and FEM analysis confirmed that the holding torque in the roll direction was approximately π/2 times larger than that in the pitch direction for each value of the internal pressure. Further, we evaluated the FEM, theoretical, and measured values of the holding torque by performing pairwise numerical comparisons. Our approach will aid the design of effective stiffening mechanisms for soft robotics applications.
|
| |
| 10:00-11:30, Paper MoAIP-05.2 | Add to My Program |
| Timor Python: A Toolbox for Industrial Modular Robotics |
|
| Külz, Jonathan | Technical University of Munich |
| Mayer, Matthias | Technical University of Munich |
| Althoff, Matthias | Technische Universität München |
Keywords: Cellular and Modular Robots, Methods and Tools for Robot System Design, Software Tools for Robot Programming
Abstract: Modular Reconfigurable Robots (MRRs) represent an exciting path forward for industrial robotics, opening up new possibilities for robot design. Compared to monolithic manipulators, they promise greater flexibility, improved maintainability, and cost-efficiency. However, there is no tool or standardized way to model and simulate assemblies of modules in the same way it has been done for robotic manipulators for decades. We introduce the Toolbox for Industrial Modular Robotics (Timor), a Python toolbox to bridge this gap and integrate modular robotics into existing simulation and optimization pipelines. Our open-source library offers model generation and task-based configuration optimization for MRRs. It can easily be integrated with existing simulation tools, not least by offering URDF export of arbitrary modular robot assemblies. Moreover, our experimental study demonstrates the effectiveness of Timor as a tool for designing modular robots optimized for specific use cases.
|
| |
| 10:00-11:30, Paper MoAIP-05.3 | Add to My Program |
| Ultra-Low Inertia 6-DOF Manipulator Arm for Touching the World |
|
| Nishii, Kazutoshi | Toyota Motor Corporation |
| Okumatsu, Yohishiro | Toyota Motor Corporation |
| Hatano, Akira | Toyota Motor Corporation |
Keywords: Mechanism Design, Tendon/Wire Mechanism
Abstract: As robotic intelligence increases, so does the importance of agents that collect data from real-world environments. When learning in contact with the environment, one must consider how to minimize the impact on the environment and maintain reproducibility. To achieve this, the contact force with the environment must be reduced. One way to achieve this is to reduce the inertia of the arm. In this study, we present an arm we have developed with 6 degrees of freedom and low inertia. The inertia of our arm has been significantly reduced compared to previous research, and experiments have confirmed that it also has low joint friction torque and good contact sensitivity.
|
| |
| 10:00-11:30, Paper MoAIP-05.4 | Add to My Program |
| Determination of the Characteristics of Gears of Robot-Like Systems by Analytical Description of Their Structure |
|
| Landler, Stefan | Technical University of Munich |
| Molina Blanco, Raúl | Technical University of Munich |
| Otto, Michael | Technical University of Munich, Chair of Machine Elements, Gear |
| Vogel-Heuser, Birgit | Technical University Munich |
| Zimmermann, Markus | Technical University of Munich |
| Stahl, Karsten | Technical University of Munich |
Keywords: Methods and Tools for Robot System Design, Product Design, Development and Prototyping, Engineering for Robotic Systems
Abstract: The axes of robots and robot-like systems (RLS) usually include e-motor-gearbox arrangements for optimal connection of the elements. The characteristics of the drive system, and thus of the robot, depend strongly on the gears. Different gearbox designs are available, which differ in stiffness, efficiency and further properties. For an application-optimal design of RLS, uniform documentation and comparability of gearbox concepts are decisive factors. The application-optimal design is supported by an interdisciplinary approach between mechanical engineering and software design, guided by adequate product development methodology. The quite heterogeneous characterization of gearboxes for RLS, which is currently the state of the art, is a relevant obstacle to the flexible and optimal design of RLS. The paper shows the analysis of the gear structure with unified symbols for specific machine elements and contact types. The introduced method gives insight into the mechanical structure of the gearboxes. Similarities between gear types can thus be revealed. This also enables the classification of new developments in the state of the art. Moreover, the developed method for analyzing the gear structure can be used to determine the characteristics of gears. Examples of these characteristics are backlash, efficiency and stiffness. Specifically, the stiffness of gears can be synthesized from the force action of individual contacts and the individual phenomena that occur with them. The representation by individual phenomena also makes it possible to extend the calculation to include influencing parameters, such as temperature, that have not been sufficiently taken into account so far.
|
| |
| 10:00-11:30, Paper MoAIP-05.5 | Add to My Program |
| Tension Jamming for Deployable Structures |
|
| Hasegawa, Daniel | Harvard University |
| Aktas, Buse | ETH Zurich |
| Howe, Robert D. | Harvard University |
Keywords: Mechanism Design, Compliant Joints and Mechanisms, Soft Robot Materials and Design
Abstract: Deployable structures provide adaptability and versatility for applications such as temporary architectures, space structures, and biomedical devices. Jamming is a mechanical phenomenon in which dramatic changes in stiffness are achieved by increasing the frictional and kinematic coupling between constituents of a structure through an external pressure. This study applies jamming, which has been used primarily in medium-scale soft robotics applications, to large-scale deployable structures with components that are soft and compact during transport but rigid upon deployment. It proposes a new jamming structure with a novel built-in actuation mechanism that enables high performance at large scales: a composite beam made of rectangular segments along a cable which can be pre-tensioned and thus jammed. Two theoretical models are developed to provide insights into the mechanical behavior of the composite beams and predict their performance under loading. A scale model of a deployable bridge is built using the tension-based composite beams, and the bridge is deployed and assembled by air with a drone, demonstrating the versatility and viability of the proposed approach for robotics applications.
|
| |
| 10:00-11:30, Paper MoAIP-05.6 | Add to My Program |
| Task2Morph: Differentiable Task-Inspired Framework for Contact-Aware Robot Design |
|
| Cai, Yishuai | National University of Defense Technology |
| Yang, Shaowu | National University of Defense Technology |
| Li, Minglong | National University of Defense Technology |
| Chen, Xinglin | National University of Defense Technology |
| Mao, Yunxin | National University of Defense Technology |
| Yi, Xiaodong | National University of Defense Technology |
| Yang, Wenjing | State Key Laboratory of High Performance Computing (HPCL), Schoo |
Keywords: Evolutionary Robotics, AI-Enabled Robotics
Abstract: Optimizing the morphologies and the controllers that adapt to various tasks is a critical issue in the field of robot design, also known as embodied intelligence. Previous works typically model it as a joint optimization problem and use search-based methods to find the optimal solution in the morphology space. However, they ignore the implicit knowledge of the task-to-morphology mapping, which can directly inspire robot design. For example, flipping heavier boxes tends to require more muscular robot arms. This paper proposes a novel and general differentiable task-inspired framework for contact-aware robot design called Task2Morph. We abstract task features highly related to task performance and use them to build a task-to-morphology mapping. Further, we embed the mapping into a differentiable robot design process, where the gradient information is leveraged for both the mapping learning and the whole optimization. The experiments are conducted on three scenarios, and the results validate that Task2Morph outperforms DiffHand, which lacks a task-inspired morphology module, in terms of efficiency and effectiveness.
|
| |
| 10:00-11:30, Paper MoAIP-05.7 | Add to My Program |
| Constraint Programming for Component-Level Robot Design |
|
| Wilhelm, Andrew | Cornell University |
| Napp, Nils | Cornell University |
Keywords: Methods and Tools for Robot System Design, Formal Methods in Robotics and Automation, Product Design, Development and Prototyping
Abstract: Effective design automation for building robots would make development faster and easier while making designs less prone to errors. However, complex multi-domain constraints make creating such tools difficult. One persistent challenge in achieving this goal of design automation is the fundamental problem of component selection, an optimization problem where, given a general robot model, components must be selected from a possibly large set of catalogs to minimize design objectives while meeting target specifications. Different approaches to this problem have used Monotone Co-Design Problems (MCDPs) or linear and quadratic programming, but these require judicious system approximations that affect the accuracy of the solution. We take an alternative approach, formulating the component selection problem as a combinatorial optimization problem, which does not require any system approximations, and using constraint programming (CP) to solve it with a depth-first branch-and-bound algorithm. As the efficacy of CP critically depends upon the orderings of variables and their domain values, we present two heuristics specific to the problem of component selection that significantly improve solve time compared to traditional constraint satisfaction programming heuristics. We also add redundant constraints to the optimization problem to further improve run time by evaluating certain global constraints before all relevant variables are assigned. We demonstrate that our CP approach can find optimal solutions from over 20 trillion candidate solutions in only seconds, up to 48 times faster than an MCDP approach solving the same problem. Finally, for three different robot designs we build the corresponding robots to physically validate that the selected components meet the target design specifications.
|
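The core search described in this abstract, depth-first branch-and-bound over component catalogs with a value-ordering heuristic, can be illustrated with a minimal sketch. The catalogs, the torque model, and the mass objective below are hypothetical stand-ins for illustration only, not the paper's actual problem formulation or heuristics:

```python
# Hypothetical catalogs: each candidate is (mass_kg, torque_Nm, cost).
motors = [(0.3, 1.2, 40), (0.5, 2.5, 55), (0.9, 4.0, 90)]
gearboxes = [(0.2, 3.0, 30), (0.4, 6.0, 50)]

def select_components(required_torque):
    """Depth-first branch-and-bound: minimize total mass subject to a
    (toy) output-torque constraint over one motor and one gearbox."""
    best = {"mass": float("inf"), "choice": None}

    def dfs(partial, mass):
        # Prune: partial mass already matches or exceeds the incumbent.
        if mass >= best["mass"]:
            return
        depth = len(partial)
        if depth == 2:  # all catalogs assigned: check the constraint
            motor, gear = partial
            if motor[1] * gear[1] >= required_torque:
                best["mass"], best["choice"] = mass, tuple(partial)
            return
        catalog = motors if depth == 0 else gearboxes
        # Value-ordering heuristic: try lighter components first, so a
        # good incumbent is found early and pruning kicks in sooner.
        for comp in sorted(catalog, key=lambda c: c[0]):
            dfs(partial + [comp], mass + comp[0])

    dfs([], 0.0)
    return best["choice"], best["mass"]
```

The same structure scales to many catalogs; real CP solvers add constraint propagation on top of this plain search.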
| |
| 10:00-11:30, Paper MoAIP-05.8 | Add to My Program |
| Design and Implementation of a Two-Limbed 3T1R Haptic Device |
|
| Kang, Long | Nanjing University of Science and Technology |
| Yang, Yang | Nanjing University of Information Science and Technology |
| Yi, Byung-Ju | Hanyang University |
Keywords: Mechanism Design, Haptics and Haptic Interfaces, Parallel Robots
Abstract: This paper presents a haptic device with a simple architecture of only two limbs that can provide translational motion in three degrees of freedom (DOF) and one-DOF rotational motion. Actuation redundancy eliminates all forward-kinematic singularities and improves the motion-force transmission property. Thanks to the special structure of the kinematic chains, all actuators are close to the base and full gravity compensation is achieved passively by using springs. Force producibility analysis shows that this haptic device is able to produce long-term continuous force feedback of 15–30 N in each direction. By developing a prototype of the haptic device and a virtual three-dimensional simulator, a preliminary performance evaluation of the haptic device was conducted. In addition, a torque distribution algorithm considering a relaxed form of actuator-torque saturation was experimentally evaluated, and a comparison with other algorithms reveals that this algorithm offers several advantages.
|
| |
| 10:00-11:30, Paper MoAIP-05.9 | Add to My Program |
| Combining Measurement Uncertainties with the Probabilistic Robustness for Safety Evaluation of Robot Systems |
|
| Baek, Woo-Jeong | Karlsruhe Institute of Technology (KIT) |
| Ledermann, Christoph | Karlsruhe Institute of Technology |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
| Kroeger, Torsten | Karlsruher Institut für Technologie (KIT) |
Keywords: Methods and Tools for Robot System Design, Robot Safety, Probability and Statistical Methods
Abstract: In this paper, we present a method to combine measurement uncertainties with the probabilistic robustness into a single system uncertainty measure. Providing a metric indicating the potential occurrence of dangerous situations is essential for safety-critical robot applications. Due to the difficulty of finding a quantifiable, unambiguous representation, however, such a metric has not been derived to date. In the case of sensory devices, measurement uncertainties are usually provided by manufacturer specifications. Apart from that, several contributions demonstrate that the accuracy of neural networks is verifiable via the robustness. However, state-of-the-art literature is mainly concerned with theoretical investigations, and scarce attention has been devoted to transferring the robustness to real-world applications. To fill this gap, we show how the probabilistic robustness can be made useful for evaluating quantitative safety limits. Our key idea is to exploit the analogy between measurement uncertainties and the probabilistic robustness: while measurement uncertainties reflect possible shifts due to technical limitations, the robustness refers to the tolerated amount of distortion in the input data for an unaltered output. Inspired by this analogy, we combine both measures to quantify the system uncertainty online. We validate our method in different settings under real-world conditions. Our findings exemplify that incorporating the novel uncertainty metric effectively reduces the rate of dangerous situations in Human-Robot Collaboration.
|
| |
| 10:00-11:30, Paper MoAIP-05.10 | Add to My Program |
| Computational Design of Closed-Chain Linkages: Respawn Algorithm for Generative Design |
|
| Ivolga, Dmitriy | ITMO University |
| Nasonov, Kirill | ITMO University |
| Borisov, Ivan | ITMO University |
| Kolyubin, Sergey | ITMO University |
Keywords: Mechanism Design, Legged Robots, Grippers and Other End-Effectors
Abstract: Designing robots is a multiphase process aimed at solving a multi-criteria optimization problem to find the best possible detailed design. Generative design (GD) aims to accelerate the design process compared to manual design, since GD allows exploring and exploiting the vast design space more efficiently. In the field of robotics, however, relevant research focuses mostly on the generation of fully-actuated open-chain kinematics, which is trivial from a mechanical engineering perspective. Within this paper, we address the problem of generative design of closed-chain linkage mechanisms. A GD algorithm has to be able to generate meaningful mechanisms which satisfy conditions of existence. We propose an optimization-driven algorithm for the generation of planar closed-chain linkages that follow a predefined trajectory. The algorithm creates an unlimited range of physically reproducible design alternatives that can be further tested in simulation. These tests could be done in order to find solutions that satisfy extra criteria, e.g., desired dynamic behavior or low energy consumption. The proposed algorithm is called "respawn" since it builds a new linkage after the ancestor has been tested in a virtual environment in pursuit of the optimal solution. To show that the algorithm is general enough, we present a set of generated linkages that can be used for a wide class of robots.
|
| |
| 10:00-11:30, Paper MoAIP-05.11 | Add to My Program |
| On Designing a Learning Robot: Improving Morphology for Enhanced Task Performance and Learning |
|
| Sorokin, Maks | Georgia Institute of Technology |
| Fu, Chuyuan | X, the Moonshot Factory |
| Tan, Jie | Google |
| Liu, Karen | Stanford University |
| Bai, Yunfei | Google X |
| Lu, Wenlong | Everyday Robots, X the Moonshot Factory |
| Ha, Sehoon | Georgia Institute of Technology |
| Khansari, Mohi | Google X |
Keywords: Mechanism Design, Visual Learning, Evolutionary Robotics
Abstract: As robots become more prevalent, optimizing their design for better performance and efficiency is becoming increasingly important. However, current robot design practices overlook the impact of perception and design choices on a robot's learning capabilities. To address this gap, we propose a comprehensive methodology that accounts for the interplay between the robot's perception, hardware characteristics, and task requirements. Our approach optimizes the robot's morphology holistically, leading to improved learning and task execution proficiency. To achieve this, we introduce a Morphology-AGnostIc Controller (MAGIC), which helps with the rapid assessment of different robot designs. The MAGIC policy is efficiently trained through a novel PRIvileged Single-stage learning via latent alignMent (PRISM) framework, which also encourages behaviors that are typical for robot onboard observation. Our simulation-based results demonstrate that morphologies optimized holistically improve robot performance by 15-20% on various manipulation tasks, and require 25x less data to match the performance of a human-expert-designed morphology. In summary, our work contributes to the growing trend of learning-based approaches in robotics and emphasizes the potential of designing robots that facilitate better learning.
|
| |
| 10:00-11:30, Paper MoAIP-05.12 | Add to My Program |
| Development of a Dynamic Quadruped with Tunable, Compliant Legs |
|
| Chen, Fuchen | Arizona State University |
| Tao, Weijia | Arizona State University |
| Aukes, Daniel | Arizona State University |
Keywords: Mechanism Design, Compliant Joints and Mechanisms, Legged Robots
Abstract: To facilitate the study of how passive leg stiffness influences locomotion dynamics and performance, we have developed an affordable and accessible 400 g quadruped robot driven by tunable compliant laminate legs, whose series and parallel stiffness can be easily adjusted; fabrication only takes 2.5 hours for all four legs. The robot can trot at 0.52 m/s or 4.4 body lengths per second with a 3.2 cost of transport (COT). Through locomotion experiments in both the real world and simulation we demonstrate that legs with different stiffness have an obvious impact on the robot's average speed, COT, and pronking height. When the robot is trotting at 4 Hz in the real world, changing the leg stiffness yields a maximum improvement of 37.1% in speed and 62.0% in COT, showing its great potential for future research on locomotion controller designs and leg stiffness optimizations.
|
| |
| 10:00-11:30, Paper MoAIP-05.13 | Add to My Program |
| A Passive Compliance Obstacle Crossing Robot for Power Line Inspection and Maintenance |
|
| Chen, Minghao | Institute of Automation, Chinese Academy of Sciences |
| Cao, Yinghua | Institute of Automation,Chinese Academy of Sciences |
| Tian, Yunong | Institute of Automation, Chinese Academy of Sciences |
| Li, En | Institute of Automation, Chinese Academy of Sciences |
| Liang, Zize | Institute of Automation, Chinese Academy of Sciences |
| Tan, Min | Institute of Automation, Chinese Academy of Sciences |
Keywords: Mechanism Design, Industrial Robots, Engineering for Robotic Systems
Abstract: In overhead power line systems, manual inspection and maintenance methods are inefficient and unsafe. Meanwhile, the majority of cantilevered robots have poor efficiency when crossing obstacles. This paper proposes a novel power line inspection and maintenance robot to solve these problems. The robot employs a passive compliance obstacle-crossing principle, which allows it to rapidly cross obstacles through the cooperation of gas springs and climbing wheels. Under high payload, the robot can roll over obstacles in 5-15 seconds without any complex strategies. A variable configuration platform is also designed, which has a multiple-line mode and a single-line mode. It makes the robot suitable for different kinds of overhead power lines. Meanwhile, the related adaptability analyses are presented. Manipulators are also installed to help the robot perform specific maintenance tasks. The results of lab experiments and field tests reveal that the robot can stably and rapidly cross obstacles, such as suspension clamps, vibration dampers, and spacers, and can perform three kinds of maintenance tasks on the line.
|
| |
| 10:00-11:30, Paper MoAIP-05.14 | Add to My Program |
| Open Robot Hardware: Progress, Benefits, Challenges, and Best Practices (I) |
|
| Patel, Vatsal | Yale University |
| Liarokapis, Minas | The University of Auckland |
| Dollar, Aaron | Yale University |
Keywords: Methods and Tools for Robot System Design, Product Design, Development and Prototyping, Mechanism Design
Abstract: Open-source projects have seen widespread adoption and improved availability in robotics over recent years. The rapid pace of progress in robotics is in part fueled by open-source projects, allowing researchers to implement novel ideas and approaches quickly. Open-source hardware in particular lowers the barrier of entry to new technologies, and can further accelerate innovation in robotics. But it is also more difficult to propagate in comparison to software because it requires replicating physical components. We present a review on Open Robot Hardware (ORH), by first highlighting key benefits and challenges encountered by users and developers of ORH, and relaying some best practices that can be adopted in developing an ORH. Then, we survey over 60 major ORH works in the different domains within robotics. Lastly, we identify strategies exemplified by the surveyed works to further detail the development process and guide developers through the design, documentation, and dissemination stages of an ORH project.
|
| |
| MoAIP-06 Regular session, Hall E |
Add to My Program |
| Modeling, Control, and Learning for Soft Robots I |
|
| |
| |
| 10:00-11:30, Paper MoAIP-06.1 | Add to My Program |
| Modelling of Tendon Driven Robot Based on Constraint Analysis and Pseudo-Rigid Body Model |
|
| Troeung, Charles | Monash University |
| Liu, Shaotong | Monash University |
| Chen, Chao | Monash University |
Keywords: Modeling, Control, and Learning for Soft Robots, Tendon/Wire Mechanism, Soft Robot Applications
Abstract: Quasi-static models of tendon-driven continuum robots (TDCR) require consideration of both the kinematic and static conditions simultaneously. While the Pseudo-Rigid Body (PRB-3R) model has been demonstrated to be efficient, existing works ignore the mechanical effect of the tendons such as elongation. In addition, the static equilibrium equations for the partially constrained tendons have been expressed in different forms within the literature. This leads to inconsistent simulation results which have not been validated by experimental data when external loads are applied. Furthermore, the inverse problem for solving the required inputs for a prescribed end effector pose has not been studied for the PRB-3R model. In this work, we introduce a new modelling approach based on constraint analysis (CA) of a multi-body system and Lagrange multipliers to systematically derive all the relevant governing equations required for a planar TDCR. This method can include tendon mechanics and efficiently solve for the direct and inverse kinetostatic models with either forces or displacements as the actuation inputs. We validate the proposed CA method using numerical simulation of a benchmark model and experimental data.
|
| |
| 10:00-11:30, Paper MoAIP-06.2 | Add to My Program |
| An Improved Koopman-MPC Framework for Data-Driven Modeling and Control of Soft Actuators |
|
| Wang, Jiajin | Southeast University |
| Xu, Baoguo | Southeast University |
| Lai, Jianwei | Southeast University |
| Wang, Yifei | Southeast University |
| Hu, Cong | Guilin University of Electronic Technology |
| Li, Huijun | Southeast University |
| Song, Aiguo | Southeast University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators
Abstract: The challenge of achieving precise control of soft actuators with strong nonlinearity is mainly due to the difficulty of deriving models suitable for model-based control techniques. Fortunately, the Koopman operator provides a data-driven method for constructing control-oriented models of nonlinear systems to achieve model predictive control (MPC). This is called the Koopman-MPC framework, which is theoretically effective for soft actuators. Nevertheless, in this framework, a critical challenge is to select correct basis functions for Koopman-based modeling. Furthermore, there is room for improvement in control performance. To overcome these problems, this letter presents an improved Koopman-MPC framework to efficiently implement model-based control techniques for soft actuators. Firstly, we propose a systematic method for selecting the basis functions, which extends the measurement coordinates with derivative and time-delay coordinates and uses the sparse identification of nonlinear dynamics (SINDy) algorithm. Secondly, an incremental model predictive control with dynamic constraints (IMPCDC) is developed based on the Koopman model. Finally, several comparative experiments are conducted to verify the utility of the improved Koopman-MPC framework for data-driven modeling and control of soft actuators.
|
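The basis-function idea above, extending the measurement coordinates with time-delay coordinates and then fitting a linear Koopman operator by least squares, can be sketched as follows. This is a generic EDMD-style illustration on a toy linear system, not the authors' SINDy-based pipeline:

```python
import numpy as np

def delay_lift(x, delays=2):
    """Lift a scalar time series into time-delay coordinates:
    z_t = [x_t, x_{t-1}, ..., x_{t-delays}]."""
    return np.stack([x[delays - d : len(x) - d] for d in range(delays + 1)],
                    axis=1)

def fit_koopman(x, delays=2):
    """Fit a linear operator K with z_{t+1} ~ K z_t by least squares
    (the core regression step of EDMD-style Koopman identification)."""
    z = delay_lift(x, delays)
    Z0, Z1 = z[:-1], z[1:]
    # Solve Z0 @ K.T = Z1 in the least-squares sense.
    K_T, *_ = np.linalg.lstsq(Z0, Z1, rcond=None)
    return K_T.T

# Toy check: for x_{t+1} = 0.9 x_t the lifted dynamics are exactly linear,
# so one Koopman step reproduces the shifted delay vector.
x = 0.9 ** np.arange(20)
K = fit_koopman(x)
z_pred = K @ np.array([x[2], x[1], x[0]])
```

An MPC layer would then optimize control inputs against the linear lifted model rather than the original nonlinear dynamics.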
| |
| 10:00-11:30, Paper MoAIP-06.3 | Add to My Program |
| Soft Robot Shape Estimation: A Load-Agnostic Geometric Method |
|
| Sorensen, Christian | Brigham Young University |
| Killpack, Marc | Brigham Young University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Soft Robot Applications
Abstract: In this paper we present a novel kinematic representation of a soft continuum robot to enable full shape estimation using a purely geometric solution. The kinematic representation involves using length-varying piecewise constant curvature segments to describe the deformed shape of the robot. Based on this kinematic representation, we can use overlapping length sensors to estimate the shape of continuously deformable bodies without prior knowledge of the current loading conditions. We show an implementation that assumes one change in curvature along the length of a joint, using string potentiometers as an arc length sensor, and an orientation measurement from the tip of the continuum joint. For 56 randomized joint configurations, we estimate the shape of a 250 mm long continuously deformable robot with less than 2.5 mm of average error. The average error is reported for each of the 10 different equally spaced points along the length, demonstrating the ability to accurately represent the full shape of the soft robot.
|
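The purely geometric idea, recovering constant-curvature segments from length sensors with no force model, can be sketched for a single planar segment. The sensor-routing model and the dimensions below are illustrative assumptions, not the paper's implementation:

```python
import math

def curvature_from_lengths(L_backbone, L_sensor, offset):
    """Curvature of a planar constant-curvature segment from a length
    sensor (e.g., a string potentiometer) routed at a radial `offset`
    from the backbone, assuming L_sensor = L_backbone * (1 - offset * kappa)."""
    return (L_backbone - L_sensor) / (offset * L_backbone)

def cc_tip(L, kappa):
    """Tip position of a planar constant-curvature arc of length L,
    starting at the origin and initially pointing along +y."""
    if abs(kappa) < 1e-9:                 # straight-segment limit
        return (0.0, L)
    theta = kappa * L                     # total bending angle
    return ((1.0 - math.cos(theta)) / kappa, math.sin(theta) / kappa)

# Example: a 250 mm segment sensed at 10 mm offset, bent at 2 rad/m.
kappa = curvature_from_lengths(0.25, 0.245, 0.01)
tip = cc_tip(0.25, kappa)
```

Chaining several such segments, with curvature switching partway along the length, gives the piecewise representation the abstract describes.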
| |
| 10:00-11:30, Paper MoAIP-06.4 | Add to My Program |
| Robust Generalized Proportional Integral Control for Trajectory Tracking of Soft Actuators in a Pediatric Wearable Assistive Device |
|
| Mucchiani, Caio | University of California Riverside |
| Liu, Zhichao | University of California, Riverside |
| Sahin, Ipsita | University of California, Riverside |
| Kokkoni, Elena | University of California, Riverside |
| Karydis, Konstantinos | University of California, Riverside |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Wearable Robotics
Abstract: Soft robotics holds promise in the development of safe yet powered assistive wearable devices for infants. Key to this is the development of closed-loop controllers that can help regulate pneumatic pressure in the device's actuators in an effort to induce controlled motion at the user's limbs and be able to track different types of trajectories. This work develops a controller for soft pneumatic actuators aimed to power a pediatric soft wearable robotic device prototype for upper extremity motion assistance. The controller tracks desired trajectories for a system of soft pneumatic actuators supporting two-degree-of-freedom shoulder joint motion on an infant-sized engineered mannequin. The degrees of freedom assisted by the actuators are equivalent to shoulder motion (abduction/adduction and flexion/extension). Embedded inertial measurement unit sensors provide real-time joint feedback. Experimental data from performing reaching tasks using the engineered mannequin are obtained and compared against ground truth to evaluate the performance of the developed controller. Results reveal that the proposed controller leads to accurate trajectory tracking performance across a variety of shoulder joint motions.
|
| |
| 10:00-11:30, Paper MoAIP-06.5 | Add to My Program |
| Data-Efficient Online Learning of Ball Placement in Robot Table Tennis |
|
| Tobuschat, Philip | Max Planck Institute for Intelligent Systems, Tübingen |
| Ma, Hao | Max Planck Institute for Intelligent Systems |
| Büchler, Dieter | Max Planck Institute for Intelligent Systems Tübingen |
| Schölkopf, Bernhard | Max Planck Institute for Intelligent Systems |
| Muehlebach, Michael | ETH |
Keywords: Modeling, Control, and Learning for Soft Robots, Bioinspired Robot Learning, Machine Learning for Robot Control
Abstract: We present an implementation of an online optimization algorithm for hitting a predefined target when returning ping-pong balls with a table tennis robot. The online algorithm optimizes over so-called interception policies, which define the manner in which the robot arm intercepts the ball. In our case, these are composed of the state of the robot arm (position and velocity) at interception time. Gradient information is provided to the optimization algorithm via the mapping from the interception policy to the landing point of the ball on the table, which is approximated with a black-box and a grey-box approach. Our algorithm is applied to a robotic arm with four degrees of freedom that is driven by pneumatic artificial muscles. As a result, the robot arm is able to return the ball onto any predefined target on the table after about 2-5 iterations. We highlight the robustness of our approach by showing rapid convergence with both the black-box and the grey-box gradients. In addition, the small number of iterations required to reach close proximity to the target also underlines the sample efficiency. A demonstration video can be found here: https://youtu.be/VC3KJoCss0k.
|
| |
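The online loop described above, adjusting the interception policy using gradients of the policy-to-landing-point map, can be sketched with a finite-difference black-box gradient on a synthetic landing model. The mapping below is a made-up smooth stand-in for the real ball/robot dynamics, and the finite-difference scheme is only one simple way to obtain black-box gradients:

```python
import numpy as np

def landing_point(policy):
    """Hypothetical smooth stand-in for the true mapping from an
    interception policy (arm state at hit time) to the landing point."""
    return np.array([policy[0] + 0.3 * policy[1] ** 2,
                     policy[1] + 0.1 * policy[0]])

def optimize_policy(target, policy, steps=100, lr=0.2, eps=1e-5):
    """Online black-box optimization: estimate the gradient of the
    squared landing error by forward differences and descend on it."""
    for _ in range(steps):
        err = landing_point(policy) - target
        f = 0.5 * (err @ err)
        grad = np.zeros_like(policy)
        for i in range(len(policy)):
            bumped = policy.copy()
            bumped[i] += eps
            err_b = landing_point(bumped) - target
            grad[i] = (0.5 * (err_b @ err_b) - f) / eps
        policy = policy - lr * grad
    return policy

target = np.array([0.5, 0.2])
policy = optimize_policy(target, np.array([0.0, 0.0]))
```

On hardware, each "step" would be one returned ball, which is why a grey-box gradient model that converges in a handful of iterations matters.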
| 10:00-11:30, Paper MoAIP-06.6 | Add to My Program |
| Learning Reduced-Order Soft Robot Controller |
|
| Liang, Chen | Zhejiang University |
| Gao, Xifeng | Tencent America |
| Wu, Kui | Tencent |
| Pan, Zherong | Tencent America |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Optimization and Optimal Control
Abstract: Deformable robots are notoriously difficult to model or control due to their high-dimensional configuration spaces. Direct trajectory optimization suffers from the curse of dimensionality and incurs a high computational cost, while learning-based controller optimization methods are sensitive to hyper-parameter tuning. To overcome these limitations, we hypothesize that high-fidelity soft robots can be both simulated and controlled by restricting to low-dimensional spaces. Under this assumption, we propose a two-stage algorithm to identify such simulation- and control-spaces. Our method first identifies the so-called simulation-space that captures the salient deformation modes, to which the robot's governing equation is restricted. We then identify the control-space, to which control signals are restricted. We propose a multi-fidelity Riemannian Bayesian bilevel optimization to identify task-specific control spaces. We show that the dimension of control-space can be less than 10 for a high-DOF soft robot to accomplish walking and swimming tasks, allowing low-dimensional MPC controllers to be applied to soft robots with tractable computational complexity.
|
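The first stage, identifying a low-dimensional simulation-space capturing the salient deformation modes, is commonly done with a snapshot PCA/SVD. The sketch below illustrates that generic idea on synthetic data and is not the authors' multi-fidelity Riemannian Bayesian pipeline:

```python
import numpy as np

def identify_simulation_space(snapshots, energy=0.99):
    """PCA/SVD over configuration snapshots (rows): return the mean and
    the smallest mode basis capturing `energy` of the variance."""
    mean = snapshots.mean(axis=0)
    _, s, Vt = np.linalg.svd(snapshots - mean, full_matrices=False)
    frac = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(frac, energy)) + 1
    return mean, Vt[:k]

def restrict(q, mean, basis):
    """Project a full configuration into reduced coordinates."""
    return basis @ (q - mean)

def prolong(z, mean, basis):
    """Map reduced coordinates back to the full configuration space."""
    return mean + basis.T @ z

# Synthetic snapshots: a 50-DOF "robot" whose motion lives on 2 modes.
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
modes = np.zeros((2, 50))
modes[0, 0] = modes[1, 1] = 1.0
snaps = 1.0 + np.column_stack([np.cos(t), np.sin(t)]) @ modes
mean, basis = identify_simulation_space(snaps)
```

Restricting both the governing equations and the controller to such a basis is what makes low-dimensional MPC tractable for a high-DOF soft body.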
| |
| 10:00-11:30, Paper MoAIP-06.7 | Add to My Program |
| A Single-Parameter Model for Soft Bellows Actuators under Axial Deformation and Loading |
|
| Treadway, Emma | Trinity University |
| Brei, Melissa | University of Michigan |
| Sedal, Audrey | McGill University |
| Gillespie, Brent | University of Michigan |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Hydraulic/Pneumatic Actuators
Abstract: Soft fluidic actuators are becoming popular for their backdrivability, potential for high power density, and their support for power supply through flexible tubes. Control and design of such actuators requires serviceable models that describe how they relate fluid pressure and flow to mechanical force and motion. We present a simple 2-port model of a bellows actuator that accounts for the relationships among fluid and mechanical variables imposed by the kinematics of the deforming bellows structure and accounts for elastic energy stored in the actuator's thermoplastic material structure. Elastic energy storage due to axial deformation is captured by revolving a differential strip whose linear elastic behavior is a nonlinear function of the actuator length. The model is evaluated through experiments in which either actuator length and pressure or force and pressure are imposed. The model has an error of 9.8% of the force range explored and yields insight into the effects of geometry changes. The resulting model can be used for model-based control or actuator design across the full operating range and can be exercised under either imposed force or imposed actuator length.
|
| |
| 10:00-11:30, Paper MoAIP-06.8 | Add to My Program |
| Task and Configuration Space Compliance of Continuum Robots Via Lie Group and Modal Shape Formulations |
|
| Orekhov, Andrew | Carnegie Mellon University |
| Johnston, Garrison | Vanderbilt University |
| Simaan, Nabil | Vanderbilt University |
Keywords: Modeling, Control, and Learning for Soft Robots, Kinematics, Flexible Robotics
Abstract: Continuum robots suffer large deflections due to internal and external forces. Accurate modeling of their passive compliance is necessary for accurate environmental interaction, especially in scenarios where direct force sensing is not practical. This paper focuses on deriving analytic formulations for the compliance of continuum robots that can be modeled as Kirchhoff rods. Compared to prior works, the approach presented herein is not subject to constant-curvature assumptions to derive the configuration space compliance, and we do not rely on computationally expensive finite difference approximations to obtain the task space compliance. Using modal approximations over curvature space and Lie group integration, we obtain closed-form expressions for the task and configuration space compliance matrices of continuum robots, thereby bridging the gap between constant-curvature analytic formulations of configuration space compliance and variable-curvature task space compliance. We first present an analytic expression for the compliance of a single Kirchhoff rod. We then extend this formulation for computing both the task space and configuration space compliance of a tendon-actuated continuum robot. We then use our formulation to study the tradeoffs between computation cost and modeling accuracy as well as the loss in accuracy from neglecting the Jacobian derivative term in the compliance model. Finally, we experimentally validate the model on a tendon-actuated continuum segment, demonstrating the model's ability to predict passive deflections with error below 11.5% of total arc length.
|
| |
| 10:00-11:30, Paper MoAIP-06.9 | Add to My Program |
| A Localization Framework for Boundary Constrained Soft Robots |
|
| Tanaka, Koki | Illinois Institute of Technology |
| Zhou, Qiyuan | Illinois Institute of Technology |
| Srivastava, Ankit | Illinois Institute of Technology |
| Spenko, Matthew | Illinois Institute of Technology |
Keywords: Modeling, Control, and Learning for Soft Robots, Localization, Soft Robot Applications
Abstract: Soft robots possess unique capabilities for adapting to the environment and interacting with it safely. However, their deformable nature also poses challenges for controlling their movement. In particular, the large deformations of a soft robot make it difficult to localize its individual body parts, which in turn impedes effective control. This paper introduces a novel localization framework designed for soft robots that are constrained by boundaries and benefit from a unique hardware architecture. To this end, we propose a method that exploits the flexible boundaries of the robot to create an onboard sensor capable of measuring the relative distances between its sub-robots. This measurement data is incorporated into a linear Kalman filter for accurate localization. We evaluate the framework's performance in benchmark and dynamic cases and demonstrate its effectiveness in improving localization accuracy compared to an IMU-based approach. The results also show that the proposed method achieves sufficient localization accuracy for contact-based mapping, enabling the robot to sense the location of obstacles in the environment. Finally, we validate the proposed framework using a physical prototype of a boundary-constrained soft robot and demonstrate its ability to accurately estimate the robot's shape. This framework has the potential to enable soft robots to autonomously navigate and map unknown environments, which could be beneficial for a variety of exploration tasks.
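The linear Kalman filter at the core of such a localization pipeline can be sketched generically; the constant-velocity model and all parameters below are illustrative stand-ins, not the paper's actual formulation.

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a linear Kalman filter."""
    x = F @ x                       # predict state
    P = F @ P @ F.T + Q             # predict covariance
    S = H @ P @ H.T + R             # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)  # Kalman gain
    x = x + K @ (z - H @ x)         # update with measurement z
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# toy 1D constant-velocity model: state = [position, velocity]
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])      # only a noisy position/distance is measured
Q = 1e-4 * np.eye(2)
R = np.array([[0.05]])

x, P = np.zeros(2), np.eye(2)
rng = np.random.default_rng(1)
for k in range(1, 200):
    true_pos = 0.5 * k * dt                      # target moving at 0.5 m/s
    z = np.array([true_pos + 0.1 * rng.standard_normal()])
    x, P = kalman_step(x, P, z, F, H, Q, R)
print(round(float(x[1]), 2))  # velocity estimate approaches the true 0.5 m/s
```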
|
| |
| 10:00-11:30, Paper MoAIP-06.10 | Add to My Program |
| EViper: A Scalable Platform for Untethered Modular Soft Robots |
|
| Cheng, Hsin | Princeton University |
| Zheng, Zhiwu | Princeton University |
| Kumar, Prakhar | Princeton University |
| Afridi, Wali | Ithaca Senior High School |
| Kim, Ben | Princeton University |
| Wagner, Sigurd | Princeton University |
| Verma, Naveen | Princeton University |
| Sturm, James | Princeton University |
| Chen, Minjie | Princeton University |
Keywords: Modeling, Control, and Learning for Soft Robots
Abstract: Soft robots present unique capabilities, but have been limited by the lack of scalable technologies for construction and the complexity of algorithms for efficient control and motion. These depend on soft-body dynamics, high-dimensional actuation patterns, and external/onboard forces. This paper presents scalable methods and platforms to study the impact of weight distribution and actuation patterns on fully untethered modular soft robots. An extendable Vibrating Intelligent Piezo-Electric Robot (eViper), together with an open-source Simulation Framework for Electroactive Robotic Sheet (SFERS) implemented in PyBullet, was developed as a platform to analyze the complex weight-locomotion interaction. By integrating power electronics, sensors, actuators, and batteries onboard, the eViper platform enables rapid design iteration and evaluation of different weight distribution and control strategies for the actuator arrays. The design supports both physics-based modeling and data-driven modeling via onboard automatic data-acquisition capabilities. We show that SFERS can provide useful guidelines for optimizing the weight distribution and actuation patterns of the eViper, thereby achieving maximum speed or minimum cost of transport (COT).
|
| |
| 10:00-11:30, Paper MoAIP-06.11 | Add to My Program |
| Domain Randomization for Robust, Affordable and Effective Closed-Loop Control of Soft Robots |
|
| Tiboni, Gabriele | Politecnico Di Torino |
| Protopapa, Andrea | Politecnico Di Torino |
| Tommasi, Tatiana | Politecnico Di Torino |
| Averta, Giuseppe | Politecnico Di Torino |
Keywords: Modeling, Control, and Learning for Soft Robots, Reinforcement Learning
Abstract: Soft robots are gaining popularity thanks to their intrinsic safety in contact and their adaptability. However, the potentially infinite number of degrees of freedom makes their modeling a daunting task, and in many cases only an approximate description is available. This challenge makes reinforcement learning (RL) based approaches inefficient when deployed in realistic scenarios, due to the large domain gap between models and the real platform. In this work, we demonstrate, for the first time, how Domain Randomization (DR) can solve this problem by enhancing RL policies for soft robots with: i) robustness w.r.t. unknown dynamics parameters; ii) reduced training times by exploiting drastically simpler dynamic models for learning; iii) better environment exploration, which can lead to exploitation of environmental constraints for optimal performance. Moreover, we introduce a novel algorithmic extension of previous adaptive domain randomization methods for the automatic inference of dynamics parameters for deformable objects. We provide an extensive evaluation in simulation on four different tasks and two soft robot designs, opening interesting perspectives for future research on reinforcement learning for closed-loop soft robot control.
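The core of domain randomization is simply resampling the dynamics parameters at every training episode. A minimal sketch, with invented parameter names and ranges (not those used in the paper):

```python
import random

# hypothetical dynamics parameters and ranges (illustrative only)
PARAM_RANGES = {
    "youngs_modulus": (1e4, 1e6),   # soft-material stiffness [Pa]
    "damping": (0.01, 0.5),
    "mass_scale": (0.8, 1.2),
}

def sample_dynamics(rng):
    """Draw one randomized dynamics configuration for a training episode."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

rng = random.Random(0)
for episode in range(3):
    params = sample_dynamics(rng)
    # env.reset(dynamics=params)  # then collect an RL rollout under `params`
    print(sorted(params))         # each episode sees freshly sampled dynamics
```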
|
| |
| 10:00-11:30, Paper MoAIP-06.12 | Add to My Program |
| Implementation of a Cosserat Rod-Based Configuration Tracking Controller on a Multi-Segment Soft Robotic Arm |
|
| Doroudchi, Azadeh | Arizona State University |
| Qiao, Zhi | ASU |
| Zhang, Wenlong | Arizona State University |
| Berman, Spring | Arizona State University |
Keywords: Modeling, Control, and Learning for Soft Robots, Motion Control, Distributed Robot Systems
Abstract: Controlling soft continuum robotic arms is challenging due to their hyper-redundancy and dexterity. In this paper we experimentally demonstrate, for the first time, closed-loop control of the configuration space variables of a soft robotic arm, composed of independently controllable segments, using a Cosserat rod model of the robot and the distributed sensing and actuation capabilities of the segments. Our controller solves the inverse dynamic problem by simulating the Cosserat rod model in MATLAB using a computationally efficient numerical solution scheme, and it applies the computed control output to the actual robot in real time. The position and orientation of the tip of each segment are measured in real time, while the remaining unknown variables that are needed to solve the inverse dynamics are estimated simultaneously in the simulation. We implement the controller on a multi-segment silicone robotic arm with pneumatic actuation, using a motion capture system to measure the segments' positions and orientations. The controller is used to reshape the arm into configurations that are achieved through combinations of bending and extension deformations in 3D space. Although the possible deformations are limited for this robot platform, our study demonstrates the potential for implementing the control approach on a wide range of continuum robots in practice. The resulting tracking performance indicates the effectiveness of the controller and the accuracy of the simulated Cosserat rod model.
|
| |
| 10:00-11:30, Paper MoAIP-06.13 | Add to My Program |
| Closed Loop Static Control of Multi-Magnet Soft Continuum Robots |
|
| Pittiglio, Giovanni | Harvard University |
| Orekhov, Andrew | Carnegie Mellon University |
| da Veiga, Tomas | University of Leeds |
| Calò, Simone | University of Leeds |
| Chandler, James Henry | University of Leeds |
| Simaan, Nabil | Vanderbilt University |
| Valdastri, Pietro | University of Leeds |
Keywords: Force Control, Medical Robots and Systems, Formal Methods in Robotics and Automation
Abstract: This paper discusses a novel static control approach applied to magnetic soft continuum robots (MSCRs). Our aim is to demonstrate the control of a multi-magnet soft continuum robot (SCR) in 3D. The proposed controller, based on a simplified yet accurate model of the robot, has a high update rate and is capable of real-time shape control. For the actuation of the MSCR, we employ the dual external permanent magnet (dEPM) platform, and we sense the shape via fiber Bragg grating (FBG). The employed actuation system and sensing technique make the proposed approach directly applicable to the medical context. We demonstrate that the proposed controller, running at approximately 300 Hz, is capable of shape tracking with a mean error of 8.5% and a maximum error of 35.2%. We experimentally show that the static controller is 25.9% more accurate than a standard PID controller in shape tracking.
|
| |
| MoAIP-07 Regular session, Hall E |
Add to My Program |
| Cooperating Robots |
|
| |
| |
| 10:00-11:30, Paper MoAIP-07.1 | Add to My Program |
| IF-Based Trajectory Planning and Cooperative Control for Transportation System of Cable Suspended Payload with Multi UAVs |
|
| Zhang, Yu | Northeastern University, China |
| Xu, Jie | Northeastern University, China |
| Zhao, Cheng | Northeastern University, China |
| Dong, Jiuxiang | Northeastern University, China |
Keywords: Distributed Robot Systems, Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents
Abstract: In this paper, we tackle the control and trajectory planning problems for the cooperative transportation system of a cable-suspended payload with multiple Unmanned Aerial Vehicles (UAVs). Firstly, a payload controller is presented that considers the dynamic coupling between the UAVs and the payload to accomplish active suppression of payload swing and tracking of complex payload trajectories. Secondly, unlike the simplified obstacle representations used in most approaches, we propose three Insetting Formation (IF) algorithms that use the complete obstacle shape to generate collision-free waypoints for the cooperative transportation system. An IF strategy integrating the three IF algorithms is proposed to improve the success rate of obstacle avoidance and reduce algorithm complexity during aggressive flight. Finally, we verify the robustness and high performance of the proposed algorithm through benchmark comparisons and real-world experiments. Moreover, our source code is released as an open-source ROS package.
|
| |
| 10:00-11:30, Paper MoAIP-07.2 | Add to My Program |
| Cooperative Dual-Arm Control for Heavy Object Manipulation Based on Hierarchical Quadratic Programming |
|
| Dio, Maximilian | Friedrich-Alexander-Universität Erlangen-Nürnberg |
| Völz, Andreas | Friedrich-Alexander-Universität Erlangen-Nürnberg |
| Graichen, Knut | Friedrich Alexander University Erlangen-Nürnberg |
Keywords: Cooperating Robots, Dual Arm Manipulation, Optimization and Optimal Control
Abstract: This paper presents a new control scheme for cooperative dual-arm robots manipulating heavy objects. The proposed method uses the full dynamical model of the kinematically coupled robot system and builds on a hierarchical quadratic programming (HQP) formulation to enforce dynamical inequality constraints such as joint torques or internal loads. This ensures optimal tracking of an object trajectory, while additional objectives with lower priority are optimized on the prior solution space. Therefore, the redundancy of the inherent load distribution problem between the two arms can be eliminated. With this approach, higher object loads can be manipulated compared to non-optimized methods. Simulations with a 14-DoF dual-arm robotic system demonstrate the effectiveness of the proposed control method. Real-time feasibility is guaranteed with an average computation time of less than 0.35 milliseconds at a control rate of 1 kilohertz.
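The priority structure of an HQP can be illustrated with a toy null-space-projected least-squares cascade. This handles equality tasks only (no inequality constraints), so it is a simplification of a true HQP, kept here just to show how lower-priority objectives are confined to the higher-priority null space:

```python
import numpy as np

def hierarchical_lsq(tasks):
    """Prioritized least squares via null-space projection (toy HQP stand-in).

    tasks: list of (A, b); earlier tasks have strictly higher priority, and
    later tasks are optimized only in the null space of all earlier ones.
    """
    n = tasks[0][0].shape[1]
    x = np.zeros(n)
    N = np.eye(n)                      # projector onto the remaining freedom
    for A, b in tasks:
        AN = A @ N
        x = x + N @ (np.linalg.pinv(AN) @ (b - A @ x))
        N = N @ (np.eye(n) - np.linalg.pinv(AN) @ AN)
    return x

# priority 1: x0 + x1 = 1;  priority 2: get as close as possible to [2, 2]
A1, b1 = np.array([[1.0, 1.0]]), np.array([1.0])
A2, b2 = np.eye(2), np.array([2.0, 2.0])
x = hierarchical_lsq([(A1, b1), (A2, b2)])
print(x)  # stays exactly on x0 + x1 = 1: the top task is never traded off
```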
|
| |
| 10:00-11:30, Paper MoAIP-07.3 | Add to My Program |
| Multi-UAV Adaptive Path Planning Using Deep Reinforcement Learning |
|
| Westheider, Jonas | University of Bonn |
| Rückin, Julius | University of Bonn |
| Popovic, Marija | University of Bonn |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Reinforcement Learning, Cooperating Robots
Abstract: Efficient aerial data collection is important in many remote sensing applications. In large-scale monitoring scenarios, deploying a team of unmanned aerial vehicles (UAVs) offers improved spatial coverage and robustness against individual failures. However, a key challenge is cooperative path planning for the UAVs to efficiently achieve a joint mission goal. We propose a novel multi-agent informative path planning approach based on deep reinforcement learning for adaptive terrain monitoring scenarios using UAV teams. We introduce new network feature representations to effectively learn path planning in a 3D workspace. By leveraging a counterfactual baseline, our approach explicitly addresses credit assignment to learn cooperative behaviour. Our experimental evaluation shows improved planning performance with respect to non-counterfactual variants, i.e., regions of interest are mapped more quickly. Results on synthetic and real-world data show that our approach has superior performance compared to state-of-the-art non-learning-based methods, while being transferable to varying team sizes and communication constraints.
|
| |
| 10:00-11:30, Paper MoAIP-07.4 | Add to My Program |
| Collective Intelligence for 2D Push Manipulations with Mobile Robots |
|
| Kuroki, So | The University of Tokyo |
| Matsushima, Tatsuya | The University of Tokyo |
| Arima, Jumpei | Matsuo Institute |
| Furuta, Hiroki | The University of Tokyo |
| Matsuo, Yutaka | The University of Tokyo |
| Gu, Shixiang Shane | OpenAI |
| Tang, Yujin | Google |
Keywords: Cooperating Robots, Mobile Manipulation, Imitation Learning
Abstract: While natural systems often present collective intelligence that allows them to self-organize and adapt to changes, the equivalent is missing in most artificial systems. We explore the possibility of such a system in the context of cooperative 2D push manipulations using mobile robots. Although conventional works demonstrate potential solutions for the problem in restricted settings, they have computational and learning difficulties. More importantly, these systems do not possess the ability to adapt when facing environmental changes. In this work, we show that by distilling a planner derived from a differentiable soft-body physics simulator into an attention-based neural network, our multi-robot push manipulation system achieves better performance than baselines. In addition, our system also generalizes to configurations not seen during training and is able to adapt toward task completion when external turbulence and environmental changes are applied.
|
| |
| 10:00-11:30, Paper MoAIP-07.5 | Add to My Program |
| Emergent Cooperative Behavior in Distributed Target Tracking with Unknown Occlusions |
|
| Li, Tianqi | Texas A&M University |
| Krakow, Lucas | Texas A&M University |
| Gopalswamy, Swaminathan | Texas A&M University |
Keywords: Cooperating Robots, Reactive and Sensor-Based Planning, Behavior-Based Systems
Abstract: Tracking multiple moving objects of interest (OOI) with multi-robot systems (MRS) has been addressed by active sensing that maintains a shared belief of OOIs and plans the motion of robots to maximize the information quality. Mobility of robots enables the behavior of pursuing better visibility, which is constrained by sensor field of view (FoV) and occluding objects. We first extend prior work to detect, maintain, and share occlusion information explicitly, allowing us to generate occlusion-aware plans even if a priori semantic occlusion information is unavailable. The efficacy of active sensing approaches is often evaluated according to estimation error and information gain metrics. However, these metrics do not directly explain the level of cooperative behavior engendered by the active sensing algorithms. Next, we extract different emergent cooperative behaviors that stem from the same underlying algorithms but manifest differently under differing scenarios. In particular, we highlight and demonstrate three emergent behavior patterns in active sensing MRS: (i) Change of tracking responsibility between agents when tracking trajectories with divergent directions or due to a re-allocation of the resource among heterogeneous agents; (ii) Awareness of occlusions to a trajectory and temporal leave-and-return of the sensing agent; (iii) Sharing of local occluding objects in MRS that subsequently improves the awareness of occlusion.
|
| |
| 10:00-11:30, Paper MoAIP-07.6 | Add to My Program |
| Multi-Objective Sparse Sensing with Ergodic Optimization |
|
| Rao, Ananya | Carnegie Mellon University |
| Choset, Howie | Carnegie Mellon University |
Keywords: Motion and Path Planning, Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems
Abstract: We consider a search problem where a robot has one or more types of sensors, each suited to detecting different types of targets or target information. Often, information in the form of a distribution of possible target locations, or locations of interest, may be available to guide the search. When multiple types of information exist, then a distribution for each type of information must also exist, thereby making the search problem that uses these distributions to guide the search a multi-objective one. In this paper, we consider a multi-objective search problem where the "cost" to use a sensor is limited. To this end, we leverage the ergodic metric, which drives agents to spend time in regions proportional to the expected amount of information there. We define the multi-objective sparse sensing ergodic (MO-SS-E) metric in order to optimize when and where each sensor measurement should be taken while planning trajectories that balance the multiple objectives. We observe that our approach maintains coverage performance even as the number of samples taken is considerably reduced. Further empirical results on different multi-agent problem setups demonstrate the applicability of our approach for both homogeneous and heterogeneous multi-agent teams.
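The ergodic metric itself can be sketched in 1D: it weights the mismatch between Fourier coefficients of the trajectory's time-average distribution and those of the target information density. The cosine basis and Sobolev-style weights below follow a common convention but are illustrative, not the MO-SS-E definition:

```python
import numpy as np

def ergodic_metric(traj, target_pdf, n_modes=10, L=1.0):
    """Ergodic metric on [0, L] using a cosine basis."""
    xs = np.linspace(0.0, L, len(target_pdf))
    metric = 0.0
    for k in range(n_modes):
        hk = np.sqrt(L) if k == 0 else np.sqrt(L / 2)   # basis normalization
        fk = lambda x: np.cos(k * np.pi * x / L) / hk
        ck = np.mean(fk(traj))                   # trajectory time-average coeff.
        phik = L * np.mean(fk(xs) * target_pdf)  # target distribution coeff.
        weight = (1.0 + (k * np.pi / L) ** 2) ** -1.5
        metric += weight * (ck - phik) ** 2
    return metric

rng = np.random.default_rng(0)
uniform_target = np.ones(200)            # uniform information density on [0, 1]
covering = rng.uniform(0.0, 1.0, 5000)   # trajectory samples spread uniformly
parked = np.full(5000, 0.1)              # trajectory stuck at x = 0.1
print(ergodic_metric(covering, uniform_target)
      < ergodic_metric(parked, uniform_target))  # → True: covering is more ergodic
```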
|
| |
| 10:00-11:30, Paper MoAIP-07.7 | Add to My Program |
| Team Coordination on Graphs with State-Dependent Edge Costs |
|
| Limbu, Manshi | George Mason University |
| Hu, Zechen | George Mason University |
| Oughourli, Sara | George Mason University |
| Wang, Xuan | George Mason University |
| Xiao, Xuesu | George Mason University |
| Shishika, Daigo | George Mason University |
Keywords: Planning, Scheduling and Coordination, Cooperating Robots, Multi-Robot Systems
Abstract: This paper studies a team coordination problem in a graph environment. Specifically, we incorporate a "support" action which an agent can take to reduce the cost for its teammate to traverse some high-cost edges. Due to this added feature, the graph traversal is no longer a standard multi-agent path planning problem. To solve this new problem, we propose a novel formulation that poses it as a planning problem in a joint state space: the joint state graph (JSG). Since the edges of the JSG implicitly incorporate the support actions taken by the agents, we are able to optimize the joint actions by solving a standard single-agent path planning problem in the JSG. One main drawback of this approach is the curse of dimensionality in both the number of agents and the size of the graph. To improve scalability in graph size, we further propose a hierarchical decomposition method to perform path planning in two levels. We provide both theoretical and empirical complexity analyses to demonstrate the efficiency of our two algorithms.
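The JSG construction can be illustrated on a toy two-agent instance; the graph, costs, and support rule below are invented for illustration. Running Dijkstra on joint states then optimizes the coupled plan, including when it pays for one agent to detour and support the other:

```python
import heapq

# toy environment: node 2 is a "support" position that cheapens the risky edge
NODES = [0, 1, 2, 3]
BASE = {(0, 1): 1, (1, 0): 1, (1, 3): 10, (3, 1): 10,
        (0, 2): 2, (2, 0): 2, (2, 3): 4, (3, 2): 4}
SUPPORT_NODE, RISKY = 2, {(1, 3), (3, 1)}

def edge_cost(move, other):
    """State-dependent cost: a teammate at the support node cheapens risky edges."""
    return 1 if move in RISKY and other == SUPPORT_NODE else BASE[move]

def jsg_shortest(start, goal):
    """Dijkstra on the joint state graph; one agent moves per step."""
    dist, pq = {start: 0}, [(0, start)]
    while pq:
        d, (a, b) = heapq.heappop(pq)
        if (a, b) == goal:
            return d
        if d > dist[(a, b)]:
            continue
        for move, other, nxt in (
                [((a, n), b, (n, b)) for n in NODES if (a, n) in BASE] +
                [((b, n), a, (a, n)) for n in NODES if (b, n) in BASE]):
            nd = d + edge_cost(move, other)
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(pq, (nd, nxt))
    return float("inf")

# one agent crosses the risky edge cheaply while its teammate supports from node 2
print(jsg_shortest((0, 0), (3, 3)))  # → 8
```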
|
| |
| 10:00-11:30, Paper MoAIP-07.8 | Add to My Program |
| Incorporating Stochastic Human Driving States in Cooperative Driving between a Human-Driven Vehicle and an Autonomous Vehicle |
|
| Hossain, Sanzida | Oklahoma State University |
| Lu, Jiaxing | Oklahoma State University |
| Bai, He | Oklahoma State University |
| Sheng, Weihua | Oklahoma State University |
Keywords: Cooperating Robots, Intelligent Transportation Systems, Human Factors and Human-in-the-Loop
Abstract: Modeling a human-driven vehicle is a difficult subject since human drivers have a variety of stochastic behavioral components that influence their driving styles. We develop a cooperative driving framework to incorporate different human behavior aspects, including the attentiveness of a driver and the tendency of the driver following advising commands. To demonstrate the framework, we consider the merging coordination between a human-driven vehicle and an autonomous vehicle (AV) in a connected environment. We propose a stochastic model predictive controller (sMPC) to address the stochasticity in human driving behavior and design coordinated merging actions to optimize the AV input and influence human driving behavior through advising commands. Simulation and human-in-the-loop (HITL) experimental results show that our formulation is capable of accommodating a distracted driver and optimizing AV inputs based on human driving behavior recognition.
|
| |
| 10:00-11:30, Paper MoAIP-07.9 | Add to My Program |
| Epistemic Planning for Heterogeneous Robotic Systems |
|
| Bramblett, Lauren | University of Virginia |
| Bezzo, Nicola | University of Virginia |
Keywords: Cooperating Robots, Path Planning for Multiple Mobile Robots or Agents, Task and Motion Planning
Abstract: In applications such as search and rescue or disaster relief, heterogeneous multi-robot systems (MRS) can provide significant advantages for complex objectives that require a suite of capabilities. However, within these application spaces, communication is often unreliable, causing inefficiencies or outright failures to arise in most MRS algorithms. Many researchers tackle this problem by requiring all robots to either maintain communication using proximity constraints or assuming that all robots will execute a predetermined plan over long periods of disconnection. The latter method allows for higher levels of efficiency in an MRS, but failures and environmental uncertainties can have cascading effects across the system, especially when a mission objective is complex or time-sensitive. To solve this, we propose an epistemic planning framework that allows robots to reason about the system state, leverage heterogeneous system makeups, and optimize information dissemination to disconnected neighbors. Dynamic epistemic logic formalizes the propagation of belief states, and epistemic task allocation and gossip are accomplished via a mixed-integer program using the belief states for utility predictions and planning. The proposed framework is validated using simulations and experiments with heterogeneous vehicles.
|
| |
| 10:00-11:30, Paper MoAIP-07.10 | Add to My Program |
| Reinforced Potential Field for Multi-Robot Motion Planning in Cluttered Environments |
|
| Zhang, Dengyu | Sun Yat-Sen University |
| Zhang, Xinyu | Sun Yat-Sen University |
| Zhang, Zheng | Sun Yat-Sen University |
| Zhu, Bo | Sun Yat-Sen University |
| Zhang, Qingrui | Sun Yat-Sen University |
Keywords: Multi-Robot Systems, Motion and Path Planning, Collision Avoidance
Abstract: Motion planning is challenging for multiple robots in cluttered environments without communication, especially with regard to real-time efficiency, motion safety, distributed computation, and trajectory optimality. In this paper, a reinforced potential field method is developed for distributed multi-robot motion planning, which is a synthesized design of reinforcement learning and artificial potential fields. An observation embedding with a self-attention mechanism is presented to model the robot-robot and robot-environment interactions. A soft wall-following rule is developed to improve trajectory smoothness. Our method belongs to reactive planning, but environment properties are implicitly encoded. The number of robots in our method can be scaled up arbitrarily. The performance improvement over vanilla APF and RL methods has been demonstrated via numerical simulations. Experiments are also performed using quadrotors to further illustrate the competence of our method.
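A vanilla artificial potential field, the baseline this method builds on, can be sketched as a gradient step on attractive plus repulsive potentials. Gains, radii, and geometry below are illustrative:

```python
import numpy as np

def apf_step(pos, goal, obstacles, k_att=1.0, k_rep=0.05, d0=0.5, step=0.05):
    """One gradient step on an attractive + repulsive potential field."""
    force = k_att * (goal - pos)                  # attractive term
    for obs in obstacles:
        d = np.linalg.norm(pos - obs)
        if 1e-9 < d < d0:                         # repel only inside radius d0
            force += k_rep * (1.0 / d - 1.0 / d0) / d**2 * (pos - obs) / d
    return pos + step * force

pos, goal = np.array([0.0, 0.0]), np.array([4.0, 0.0])
obstacles = [np.array([2.0, 0.4])]                # slightly off the direct path
for _ in range(400):
    pos = apf_step(pos, goal, obstacles)
print(np.linalg.norm(pos - goal) < 0.1)  # → True: the robot skirts the obstacle
```

The well-known local-minimum weakness of this baseline is one motivation for combining it with a learned policy, as the abstract describes.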
|
| |
| 10:00-11:30, Paper MoAIP-07.11 | Add to My Program |
| Robot Team Data Collection with Anywhere Communication |
|
| Schack, Matthew | Colorado School of Mines |
| Rogers III, John G. | US Army Research Laboratory |
| Han, Qi | Colorado School of Mines |
| Dantam, Neil | Colorado School of Mines |
Keywords: Multi-Robot Systems, Cooperating Robots, Path Planning for Multiple Mobile Robots or Agents
Abstract: Using robots to collect data is an effective way to obtain information from the environment and communicate it to a static base station. Furthermore, robots have the capability to communicate with one another, potentially decreasing the time for data to reach the base station. We present a Mixed Integer Linear Program that reasons about discrete routing choices, continuous robot paths, and their effect on the latency of the data collection task. We analyze our formulation, discuss optimization challenges inherent to the data collection problem, and propose a factored formulation that finds optimal answers more efficiently. Our work is able to find paths that reduce latency by up to 101% compared to treating all robots independently in our tested scenarios.
|
| |
| 10:00-11:30, Paper MoAIP-07.12 | Add to My Program |
| Coordination of Multiple Mobile Manipulators for Ordered Sorting of Cluttered Objects |
|
| Ahn, Jeeho | Korea University |
| Lee, Sebin | Sogang University |
| Nam, Changjoo | Sogang University |
Keywords: Cooperating Robots, Multi-Robot Systems, Manipulation Planning
Abstract: We present a coordination method for multiple mobile manipulators to sort objects in clutter. We consider the object rearrangement problem in which the objects must be sorted into different groups in a particular order. In clutter, the order constraints cannot be easily satisfied since some objects occlude others, so the occluded objects are not directly accessible to the robots. Objects occluding others may need to be moved more than once to make the occluded objects accessible. Such rearrangement problems fall into the class of nonmonotone rearrangement problems, which are computationally intractable. Nonmonotone problems with order constraints are harder still, and involving multiple robots requires additional computation for task allocation. In this work, we aim to develop a fast, albeit suboptimal, method for multi-robot coordination for ordered sorting in clutter. The proposed method finds a sequence of objects to be sorted using a search such that the order constraint in each group is satisfied. The search can solve nonmonotone instances that require temporary relocation of some objects to access the next object to be sorted. Once a complete sorting sequence is found, the objects in the sequence are assigned to multiple mobile manipulators using a greedy task allocation method. We develop four versions of the method with different search strategies. In the experiments, we show that our method can find a sorting sequence quickly (e.g., 4.6 sec with 20 objects sorted into five groups) even though the solved instances include hard nonmonotone ones. The extensive tests and the experiments in simulation show the ability of the method to solve the real-world sorting problem using multiple mobile manipulators.
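The greedy task allocation step can be sketched as earliest-free-robot assignment over the already-sorted object sequence. This toy version ignores travel and precedence timing and all names are hypothetical; it only illustrates the greedy assignment pattern:

```python
import heapq

def greedy_allocate(sorted_objects, handle_cost, n_robots):
    """Assign an ordered object sequence greedily to the earliest-free robot."""
    robots = [(0.0, r) for r in range(n_robots)]  # (time when free, robot id)
    heapq.heapify(robots)
    plan = []
    for obj in sorted_objects:
        t_free, r = heapq.heappop(robots)
        finish = t_free + handle_cost[obj]
        plan.append((obj, r, finish))             # object, robot, finish time
        heapq.heappush(robots, (finish, r))
    return plan, max(f for _, _, f in plan)

objects = ["a", "b", "c", "d"]                    # already in sorted order
costs = {"a": 3.0, "b": 1.0, "c": 2.0, "d": 1.0}
plan, makespan = greedy_allocate(objects, costs, n_robots=2)
print(makespan)  # → 4.0
```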
|
| |
| 10:00-11:30, Paper MoAIP-07.13 | Add to My Program |
| MOTLEE: Distributed Mobile Multi-Object Tracking with Localization Error Elimination |
|
| Peterson, Mason B. | Massachusetts Institute of Technology |
| Lusk, Parker C. | Massachusetts Institute of Technology |
| How, Jonathan | Massachusetts Institute of Technology |
Keywords: Distributed Robot Systems, Visual Tracking, Localization
Abstract: We present MOTLEE, a distributed mobile multi-object tracking algorithm that enables a team of robots to collaboratively track moving objects in the presence of localization error. Existing approaches to distributed tracking make limiting assumptions regarding the relative spatial relationship of sensors, including assuming a static sensor network or that perfect localization is available. Instead, we develop an algorithm based on the Kalman-Consensus filter for distributed tracking that properly leverages localization uncertainty in collaborative tracking. Further, our method allows the team to maintain an accurate understanding of dynamic objects in the environment by realigning robot frames and incorporating frame alignment uncertainty into our object tracking formulation. We evaluate our method in hardware on a team of three mobile ground robots tracking four people. Compared to previous works that do not account for localization error, we show that MOTLEE is resilient to localization uncertainties, enabling accurate tracking in distributed, dynamic settings with mobile tracking sensors.
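The consensus ingredient of a Kalman-Consensus filter can be illustrated in isolation: each robot nudges its estimate toward its neighbors' estimates every iteration. This simplification drops the Kalman measurement update and frame-alignment terms central to MOTLEE; the graph and values are invented:

```python
import numpy as np

def consensus_step(estimates, adjacency, eps=0.2):
    """One consensus iteration: each robot moves toward its neighbors' estimates."""
    new = estimates.copy()
    for i in range(len(estimates)):
        for j in np.nonzero(adjacency[i])[0]:
            new[i] += eps * (estimates[j] - estimates[i])
    return new

# three robots on a line graph, disagreeing about a tracked object's position
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
x = np.array([1.0, 2.0, 6.0])
for _ in range(50):
    x = consensus_step(x, A)
print(np.round(x, 2))  # → [3. 3. 3.]: all estimates agree on the average
```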
|
| |
| MoAIP-08 Regular session, Hall E |
Add to My Program |
| Legged Robots I |
|
| |
| |
| 10:00-11:30, Paper MoAIP-08.1 | Add to My Program |
| Dynamic Object Tracking for Quadruped Manipulator with Spherical Image-Based Approach |
|
| Zhang, Tianlin | Harbin Institute of Technology |
| Guo, Sikai | Harbin Institute of Technology |
| Xiong, Xiaogang | Harbin Institute of Technology, Shenzhen |
| Li, Wanlei | Harbin Institute of Technology(ShenZhen) |
| Qi, Zezheng | Harbin Institute of Technology, Shenzhen |
| Lou, Yunjiang | Harbin Institute of Technology, Shenzhen |
Keywords: Legged Robots, Visual Servoing, Visual Tracking
Abstract: Accurately estimating and tracking the motion of surrounding dynamic objects is one of the important tasks for the autonomy of a quadruped manipulator. However, with only an onboard RGB camera, it remains challenging for a quadruped manipulator to track the motion of a dynamic object moving with unknown and changing velocities. To address this problem, this manuscript proposes a novel image-based visual servoing (IBVS) approach consisting of three elements: a spherical projection model, a robust super-twisting observer, and a model predictive controller (MPC). The spherical projection model decouples the visual error of the dynamic target into linear and angular components. Then, in the presence of visual error, the robustness of the observer is exploited to estimate the unknown and changing velocities of the dynamic target without depth estimation. Finally, the estimated velocity is fed into the MPC to generate joint torques for the quadruped manipulator to track the motion of the dynamic target. The proposed approach is validated through hardware experiments, and the experimental results illustrate its effectiveness in improving the autonomy of the quadruped manipulator.
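A super-twisting observer of the kind mentioned can be illustrated as a robust differentiator that estimates an unknown rate without any model of the signal; gains and the test signal below are illustrative, not the paper's:

```python
import numpy as np

def supertwisting_derivative(signal, dt, lam1=2.0, lam2=2.0):
    """Super-twisting differentiator: estimate d(signal)/dt model-free."""
    z0, z1 = signal[0], 0.0
    for f in signal:
        e = z0 - f                                        # observation error
        z0 += dt * (-lam1 * np.sqrt(abs(e)) * np.sign(e) + z1)
        z1 += dt * (-lam2 * np.sign(e))                   # discontinuous term
    return z1

dt = 1e-3
t = np.arange(0.0, 5.0, dt)
z1 = supertwisting_derivative(np.sin(t), dt)  # true derivative: cos(t)
print(abs(z1 - np.cos(t[-1])) < 0.05)  # the estimate tracks the unknown rate
```

The same structure, applied to the visual error, yields a velocity estimate of the target without requiring depth, which matches the abstract's motivation.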
|
| |
| 10:00-11:30, Paper MoAIP-08.2 | Add to My Program |
| Proprioception and Tail Control Enable Extreme Terrain Traversal by Quadruped Robots |
|
| Yang, Yanhao | Oregon State University |
| Norby, Joseph | Apptronik |
| Yim, Justin K. | University of Illinois Urbana-Champaign |
| Johnson, Aaron M. | Carnegie Mellon University |
Keywords: Legged Robots, Biologically-Inspired Robots, Optimization and Optimal Control
Abstract: Legged robots leverage ground contacts and the reaction forces they provide to achieve agile locomotion. However, uncertainty coupled with contact discontinuities can lead to failure, especially in real-world environments with unexpected height variations such as rocky hills or curbs. To enable dynamic traversal of extreme terrain, this work introduces 1) a proprioception-based gait planner for estimating unknown hybrid events due to elevation changes and responding by modifying contact schedules and planned footholds online, and 2) a two-degree-of-freedom tail for improving contact-independent control and a corresponding decoupled control scheme for better versatility and efficiency. Simulation results show that the gait planner significantly improves stability under unforeseen terrain height changes compared to methods that assume fixed contact schedules and footholds. Further, tests have shown that the tail is particularly effective at maintaining stability when encountering a terrain change with an initial angular disturbance. The results show that these approaches work synergistically to stabilize locomotion with elevation changes up to 1.5 times the leg length and tilted initial states.
|
| |
| 10:00-11:30, Paper MoAIP-08.3 | Add to My Program |
| Run and Catch: Dynamic Object-Catching of Quadrupedal Robots |
|
| You, Yangwei | Institute for Infocomm Research |
| Liu, Tianlin | Peking University |
| Liang, Xiaowei | Beijing Xiaomi Mobile Software Co., Ltd |
| Xu, Zhe | Beijing Institute of Technology |
| Zhou, Mingliang | Beijing Xiaomi Mobile Software Co., Ltd |
| Li, Zhibin (Alex) | University College London |
| Zhang, Shiwu | University of Science and Technology of China |
Keywords: Legged Robots, Whole-Body Motion Planning and Control, Climbing Robots
Abstract: Quadrupedal robots are acquiring an increasing range of real-world capabilities, but remain primarily limited to locomotion tasks. To expand their task-level abilities to object acquisition, i.e., running to catch a target much as a dog catches a frisbee, this paper developed a control pipeline using stereo vision for legged robots that allows for dynamically catching balls while the robot is in motion. To achieve high-frame-rate tracking, we designed a ball that actively emits homogeneous infrared (IR) light and located the flying ball via binocular vision positioning using the onboard RealSense D450 camera with an additional IR bandpass filter. The camera was mounted on top of a 2-DoF head to gain a full view of the target ball. A state estimation module was developed to fuse the vision positioning, the camera motor readings, the localization output of a RealSense T265 mounted on the back, and the legged odometry. With the use of a ballistic model, we achieved robust estimation of both the ball and robot positions in an inertial frame. Additionally, we developed a closed-loop catching strategy and employed trajectory prediction so that tracking and run-to-catch were performed simultaneously, which is critical for such drastically dynamic and precise tasks. The proposed approach was validated through both static tests and dynamic catching experiments conducted on the CyberDog robot with a high success rate.
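The ballistic model mentioned in the abstract reduces, in its drag-free form, to solving a quadratic for the time at which the ball crosses the catch height; the sketch below illustrates that step only (the function name and default catch height are hypothetical):

```python
import math

def predict_catch_point(p, v, catch_height=0.5, g=9.81):
    """Drag-free ballistic prediction of where a thrown ball crosses the
    catch height.  p, v: 3D position (m) and velocity (m/s) of the ball.
    Returns (x, y, t) at the descending crossing, or None if unreachable.
    (Illustrative sketch; the paper fuses this kind of model with stereo
    tracking and robot-state estimation.)"""
    # solve p_z + v_z*t - 0.5*g*t^2 = catch_height for the later root
    a, b, c = -0.5 * g, v[2], p[2] - catch_height
    disc = b * b - 4 * a * c
    if disc < 0:
        return None                       # ball never reaches catch height
    t = (-b - math.sqrt(disc)) / (2 * a)  # later (descending) crossing
    if t <= 0:
        return None
    return p[0] + v[0] * t, p[1] + v[1] * t, t
```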
|
| |
| 10:00-11:30, Paper MoAIP-08.4 | Add to My Program |
| A Composite Control Strategy for Quadruped Robot by Integrating Reinforcement Learning and Model-Based Control |
|
| Lyu, Shangke | Nanyang Technological University |
| Zhao, Han | Beijing University of Posts and Telecommunications |
| Wang, Donglin | Westlake University |
Keywords: Legged Robots, Motion Control, Reinforcement Learning
Abstract: Locomotion in the wild requires the quadruped robot to have strong capabilities in adaptation and robustness. Deep reinforcement learning (DRL) exhibits huge potential for environmental adaptability, while its stability issues remain open. On the other hand, the quadruped robot's dynamic model contains a great deal of information that is beneficial to robust control. Combining DRL with model-based control may capture both strengths and holds promise for better robustness. In this paper, DRL and the proposed model-based controller are tightly integrated in a novel manner such that the model-based controller rectifies the gait commands generated by DRL based on the system dynamic model, so as to enhance the robustness of the quadruped robot against external disturbances. Besides, a potential energy function is introduced to achieve compliant contact. The stability of the proposed method is ensured by passivity analysis. Several physical experiments are carried out to verify the performance of the proposed method.
|
| |
| 10:00-11:30, Paper MoAIP-08.5 | Add to My Program |
| Load Awareness: Sensorless Body Payload Sensing and Localization for Heavy Quadruped Robot |
|
| Liu, Shaoxun | Shanghai Jiao Tong University |
| Zhou, Shiyu | Shanghai Jiao Tong University |
| Pan, Zheng | Shanghai Jiao Tong University |
| Niu, Zhihua | Shanghai Jiao Tong University |
| Wang, Rongrong | Shanghai Jiao Tong University |
Keywords: Legged Robots, Contact Modeling, Dynamics
Abstract: Heavy quadruped robots have great potential for overcoming obstacles, showing great promise for transportation in complex environments. Ground reaction force (GRF) is a crucial state variable for quadrupedal control. Most GRF observers are implemented on lightweight quadrupeds, with little consideration of whether the load on the body is static or shifting. However, load information is vital for heavy-duty quadrupeds applied to transportation tasks. In this paper, we decompose the whole-body dynamics into the body dynamics combined with individual floating single-leg dynamics and observe the virtual coupling effects between the body and legs. Based on the observed coupling forces and centroidal dynamics (CD), the GRF of a stance leg is obtained without knowledge of body weight, movement, or load information. Furthermore, we utilize the body dynamics and the observed virtual forces to obtain the body's unknown payload. By reconstructing the moment balance equation, we obtain the payload's position with respect to the body frame. Compared to conventional quadrupedal GRF observation methods, this framework achieves higher observation accuracy on heavy quadrupeds without load or body information. Additionally, it enables real-time calculation of load magnitude and position.
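The moment-balance inversion described above can be illustrated for the simplest case of a static point payload: the vertical force residual gives the mass, and the moment residual gives the position (this closed-form sketch is a simplification of the paper's observer-based formulation):

```python
def locate_payload(F_res, M_res, g=9.81):
    """Recover payload mass and body-frame (x, y) position from the residual
    force/moment that the body dynamics cannot explain.

    A static point load m at r = (x, y, 0), with gravity along -z, gives
    F_z = -m*g and M = r x F = (-y*m*g, x*m*g, 0), so the moment balance
    inverts in closed form.  (Illustrative static-case reconstruction of the
    paper's moment-balance idea.)"""
    m = -F_res[2] / g        # payload mass from vertical force residual
    x = M_res[1] / (m * g)   # moment balance about the y-axis
    y = -M_res[0] / (m * g)  # moment balance about the x-axis
    return m, x, y
```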
|
| |
| 10:00-11:30, Paper MoAIP-08.6 | Add to My Program |
| Evolutionary-Based Online Motion Planning Framework for Quadruped Robot Jumping |
|
| Yue, Linzhu | The Chinese University of Hong Kong |
| Song, Zhitao | The Chinese University of Hong Kong |
| Zhang, Hongbo | The Chinese University of Hong Kong |
| Zhang, Lingwei | Hong Kong Centre for Logistics Robotics |
| Zeng, Xuanqi | Chinese University of Hong Kong |
| Liu, Yunhui | Chinese University of Hong Kong |
Keywords: Legged Robots, Whole-Body Motion Planning and Control, Motion and Path Planning
Abstract: Offline evolutionary-based methodologies have supplied a successful motion planning framework for quadrupedal jumping. However, the time-consuming computation caused by massive population evolution in offline evolutionary-based jumping frameworks significantly limits their adoption in the quadrupedal field. This paper presents a computationally efficient online motion planning framework based on meta-heuristic Differential Evolution (DE), Latin hypercube sampling, and Configuration space (DLC). The DLC framework formulates a multidimensional optimization problem that leverages centroidal dynamics to determine the ideal trajectory of the center of mass (CoM) and the ground reaction forces (GRFs). The configuration space is introduced into the evolutionary optimization to condense the search region. Latin hypercube sampling offers more uniform initial populations of DE under limited sampling points, which accelerates escape from local minima. This work also constructs a collection of pre-motion trajectories as a warm start, used when the objective state is in the neighborhood of a pre-motion state, to drastically reduce solving time. The proposed methodology is successfully validated via real robot experiments for online jumping trajectory optimization with different jumping motions (e.g., ordinary jumping, flipping, and spinning).
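Latin hypercube initialization, one of the three DLC ingredients, can be sketched as follows: each dimension is split into pop_size equal strata, and every stratum receives exactly one individual (generic LHS only; the coupling to DE and the configuration-space restriction are not shown):

```python
import random

def latin_hypercube_init(pop_size, bounds, rng=None):
    """Latin hypercube sampling for a DE initial population: in every
    dimension, each of the pop_size strata receives exactly one individual,
    giving more uniform coverage than plain uniform sampling.
    bounds: list of (lo, hi) per dimension.  (Generic LHS sketch.)"""
    rng = rng or random.Random(0)
    dim = len(bounds)
    pop = [[0.0] * dim for _ in range(pop_size)]
    for d, (lo, hi) in enumerate(bounds):
        strata = list(range(pop_size))
        rng.shuffle(strata)                 # one random stratum per individual
        width = (hi - lo) / pop_size
        for i, s in enumerate(strata):
            pop[i][d] = lo + (s + rng.random()) * width  # jitter inside stratum
    return pop
```

Because every stratum of every dimension is occupied exactly once, small populations still cover the whole range of each decision variable, which is what helps DE avoid clustering in a single basin.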
|
| |
| 10:00-11:30, Paper MoAIP-08.7 | Add to My Program |
| Multi-IMU Proprioceptive Odometry for Legged Robots |
|
| Yang, Shuo | Carnegie Mellon University |
| Zhang, Zixin | Carnegie Mellon University |
| Bokser, Benjamin | Boston Dynamics AI Institute |
| Manchester, Zachary | Carnegie Mellon University |
Keywords: Legged Robots, Sensor Fusion, Contact Modeling
Abstract: This paper presents a novel, low-cost proprioceptive sensing solution for legged robots with point feet to achieve accurate low-drift long-term position and velocity estimation. In addition to conventional sensors, including one body Inertial Measurement Unit (IMU) and joint encoders, we attach an additional IMU to each calf link of the robot just above the foot. An extended Kalman filter is used to fuse data from all sensors to estimate the robot's body and foot positions in the world frame. Using the additional IMUs, the filter is able to reliably determine foot contact modes and detect foot slips without tactile or pressure-based foot contact sensors. This sensing solution is validated in various hardware experiments, which confirm that it can reduce position drift by nearly an order of magnitude compared to conventional approaches with only a very modest increase in hardware and computational costs.
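One way the extra foot IMUs can expose slip, sketched in simplified 1D form below, is by dead-reckoning the foot velocity during stance: a foot in firm contact should stay near zero velocity (the threshold and interface are illustrative; the paper's method fuses all sensors in an extended Kalman filter rather than thresholding):

```python
def detect_foot_slip(accels, dt, speed_thresh=0.05):
    """Flag a foot slip during stance by integrating the foot IMU's
    gravity-compensated acceleration samples: a foot in firm contact should
    keep near-zero velocity, so an integrated speed above the threshold
    marks a slip.  (Simplified 1D sketch over one stance window.)"""
    v = 0.0
    for a in accels:
        v += a * dt                  # dead-reckon foot velocity
        if abs(v) > speed_thresh:
            return True              # foot is moving: slip detected
    return False
```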
|
| |
| 10:00-11:30, Paper MoAIP-08.8 | Add to My Program |
| Design and Motion Guidelines for Quadrupedal Locomotion of Maximum Speed or Efficiency with Serial and Parallel Legs |
|
| Machairas, Konstantinos | National Technical University of Athens |
| Papadopoulos, Evangelos | National Technical University of Athens |
Keywords: Legged Robots, Task and Motion Planning, Mechanism Design
Abstract: Analytical expressions are derived for actuator demands in quadrupedal locomotion of constant speed and height by using a reduction from a trot/pace 6-bar model to a single-legged model and employing two widely used two-segmented leg architectures, the serial and the parallel. A method is developed that outputs optimal gait characteristics and leg designs for a robot to move with maximum efficiency or speed. Also, generic guidelines are presented, which answer questions such as: which speed should be selected for maximum efficiency, or which is the optimal leg architecture (serial/parallel) and leg length for maximum efficiency or speed.
|
| |
| 10:00-11:30, Paper MoAIP-08.9 | Add to My Program |
| Towards Legged Locomotion on Steep Planetary Terrain |
|
| Valsecchi, Giorgio | Robotic System Lab, ETH |
| Weibel, Cedric | ETH Zuerich |
| Kolvenbach, Hendrik | ETH Zurich |
| Hutter, Marco | ETH Zurich |
Keywords: Legged Robots, Space Robotics and Automation, Reinforcement Learning
Abstract: Scientific exploration of planetary bodies is an activity well-suited for robots. Unfortunately, the regions that are richer in potential discoveries, such as impact craters, caves, and volcanic terraces, are hard to access with wheeled robots. Recent advances in legged locomotion have shown the potential of the technology to overcome difficult terrain such as slopes and slippery surfaces. In this work, we focus on locomotion on sandy slopes, comparing baseline state-of-the-art walking policies with a novel crawling-based gait for quadrupedal robots. We fine-tuned a state-of-the-art locomotion framework and introduced hardware modifications to the robot ANYmal that enable walking on its knees. Moreover, we integrated a novel stability metric, the stability margin, into the training process to increase robustness in such conditions. We benchmarked the locomotion policies in simulation and in real-world experiments on a Martian soil simulant. Results show an improvement in locomotion performance and a more robust gait at higher slope angles.
|
| |
| 10:00-11:30, Paper MoAIP-08.10 | Add to My Program |
| Dynamic Hybrid Locomotion and Jumping for Wheeled-Legged Quadrupeds |
|
| Hosseini, Mojtaba | University of Bonn |
| Rodriguez, Diego | University of Bonn |
| Behnke, Sven | University of Bonn |
Keywords: Legged Robots, Wheeled Robots, Whole-Body Motion Planning and Control
Abstract: Hybrid wheeled-legged quadrupeds have the potential to navigate challenging terrain with agility and speed and over long distances. However, obstacles can impede their progress by requiring the robots to either slow down to step over obstacles or modify their path to circumvent the obstacles. We propose a motion optimization framework for quadruped robots that incorporates non-steerable wheels and dynamic jumps, enabling them to perform hybrid wheeled-legged locomotion while overcoming obstacles without slowing down. Our approach involves a model predictive controller that uses a time-varying rigid body dynamics model of the robot, including legs and wheels, to track dynamic motions such as jumping. We also introduce a method for driving with minimal leg swings to reduce energy consumption by sparing the effort involved in lifting the wheels. Our method was tested successfully on the wheeled Mini Cheetah and the Unitree AlienGo robots. Further videos and results are available at https://www.ais.uni-bonn.de/%7ehosseini/iros2023
|
| |
| 10:00-11:30, Paper MoAIP-08.11 | Add to My Program |
| Quadrupedal Footstep Planning Using Learned Motion Models of a Black-Box Controller |
|
| Taouil, Ilyass | Istituto Italiano Di Tecnologia |
| Turrisi, Giulio | Istituto Italiano Di Tecnologia |
| Schleich, Daniel | University of Bonn |
| Barasuol, Victor | Istituto Italiano Di Tecnologia |
| Semini, Claudio | Istituto Italiano Di Tecnologia |
| Behnke, Sven | University of Bonn |
Keywords: Legged Robots, Motion and Path Planning, Machine Learning for Robot Control
Abstract: Legged robots are increasingly entering new domains and applications, including search and rescue, inspection, and logistics. However, for such systems to be valuable in real-world scenarios, they must be able to autonomously and robustly navigate irregular terrains. In many cases, robots that are sold on the market do not provide such abilities, being able to perform only blind locomotion. Furthermore, their controller cannot be easily modified by the end-user, requiring a new and time-consuming control synthesis. In this work, we present a local motion planning pipeline that extends the capabilities of a black-box walking controller that is only able to track high-level reference velocities. More precisely, we learn a set of motion models for such a controller that map high-level velocity commands to Center of Mass (CoM) and footstep motions. We then integrate these models with a variant of the A* algorithm to plan the CoM trajectory, footstep sequences, and corresponding high-level velocity commands based on visual information, allowing the quadruped to safely traverse irregular terrain on demand.
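The planner's structure, A* search over high-level velocity commands with a learned step model predicting the resulting CoM motion, can be sketched generically as below; `step_model`, `is_safe`, and the command set are placeholders for the learned and perception components, not the paper's actual interfaces:

```python
import heapq

def plan_velocity_commands(start, goal, commands, step_model, is_safe, tol=0.25):
    """A* over high-level velocity commands.  step_model(state, cmd) plays
    the role of a learned motion model, predicting the CoM displacement a
    black-box controller produces for each command; is_safe(state) stands in
    for terrain checks.  Returns a command sequence or None."""
    def h(s):                            # admissible straight-line heuristic
        return ((s[0] - goal[0]) ** 2 + (s[1] - goal[1]) ** 2) ** 0.5
    open_set = [(h(start), 0.0, start, [])]
    seen = set()
    while open_set:
        f, g, s, plan = heapq.heappop(open_set)
        key = (round(s[0], 1), round(s[1], 1))  # coarse duplicate detection
        if key in seen:
            continue
        seen.add(key)
        if h(s) < tol:
            return plan
        for cmd in commands:
            nxt = step_model(s, cmd)
            if is_safe(nxt):
                heapq.heappush(open_set, (g + 1 + h(nxt), g + 1, nxt, plan + [cmd]))
    return None
```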
|
| |
| 10:00-11:30, Paper MoAIP-08.12 | Add to My Program |
| An Efficient Paradigm for Feasibility Guarantees in Legged Locomotion (I) |
|
| Abdalla, Abdelrahman | Italian Institute of Technology |
| Focchi, Michele | Università Di Trento |
| Orsolino, Romeo | Arrival Ltd |
| Semini, Claudio | Istituto Italiano Di Tecnologia |
Keywords: Legged Robots, Dynamics, Kinematics, Motion and Path Planning
Abstract: Developing feasible body trajectories for legged systems on arbitrary terrains is a challenging task. In this article, we present a paradigm that allows designing feasible Center of Mass (CoM) and body trajectories in an efficient manner. In our previous work (Orsolino et al., 2020), we introduced the notion of the two-dimensional feasible region, where static balance and the satisfaction of joint-torque limits were guaranteed whenever the projection of the CoM lay inside the proposed admissible region. In this work, we propose a general formulation of the improved feasible region that guarantees dynamic balance alongside the satisfaction of both joint-torque and kinematic limits in an efficient manner. To incorporate the feasibility of the kinematic limits, we introduce an algorithm that computes the reachable region of the CoM. Furthermore, we propose an efficient planning strategy that utilizes the improved feasible region to design feasible CoM and body orientation trajectories. Finally, we validate the capabilities of the improved feasible region and the effectiveness of the proposed planning strategy using simulations and experiments on the 90 kg hydraulically actuated quadruped and the 21 kg Aliengo robots.
|
| |
| MoAIP-09 Regular session, Hall E |
Add to My Program |
| Clone of 'Motion and Path Planning I' |
|
| |
| |
| 10:00-11:30, Paper MoAIP-09.1 | Add to My Program |
| Locomotion Planning of a Truss Robot on Irregular Terrain |
|
| Bae, Jangho | University of Pennsylvania |
| Park, Inha | Hanyang University |
| Yim, Mark | University of Pennsylvania |
| Seo, TaeWon | Hanyang University |
Keywords: Cellular and Modular Robots, Motion and Path Planning
Abstract: This paper proposes a new locomotion algorithm for truss robots on irregular terrain, in particular for the Variable Topology Truss (VTT) system. The previous Polygon-based Random Tree (PRT) search algorithm for support polygon generation is extended to irregular terrain while considering friction and internal force limitations. By characterizing the terrain, unreachable areas are excluded from the search to increase efficiency. A one-step rolling motion primitive is generated based on the kinematics, statics, and constraints of the VTT. The locomotion planning is completed by transforming and connecting multiple motion primitives with respect to the desired support polygons. The algorithm's performance is verified by conducting simulations in multiple types of environments.
|
| |
| 10:00-11:30, Paper MoAIP-09.2 | Add to My Program |
| A Model Predictive Path Integral Method for Fast, Proactive, and Uncertainty-Aware UAV Planning in Cluttered Environments |
|
| Higgins, Jacob | University of Virginia |
| Mohammad, Nicholas | University of Virginia |
| Bezzo, Nicola | University of Virginia |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation, Aerial Systems: Mechanics and Control
Abstract: Current motion planning approaches for autonomous mobile robots often assume that the low-level controller of the system is able to track the planned motion with very high accuracy. In practice, however, tracking error can be affected by many factors and can lead to potential collisions when the robot must traverse a cluttered environment. To address this problem, this paper proposes a novel receding-horizon motion planning approach based on Model Predictive Path Integral (MPPI) control theory -- a flexible sampling-based control technique that requires minimal assumptions on vehicle dynamics and cost functions. This flexibility is leveraged to propose a motion planning framework that also considers a data-informed risk function. Using the MPPI algorithm as a motion planner also reduces the number of samples required by the algorithm, relaxing the hardware requirements for implementation. The proposed approach is validated through trajectory generation for a quadrotor unmanned aerial vehicle (UAV), where fast motion increases trajectory tracking error and can lead to collisions with nearby obstacles. Simulations and hardware experiments demonstrate that the MPPI motion planner proactively adapts to the obstacles that the UAV must negotiate, slowing down when near obstacles and moving quickly when away from them, eliminating collisions entirely while still producing lively motion.
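The core MPPI update is a softmax-weighted average of sampled control perturbations. A minimal generic sketch (1D controls, illustrative hyperparameters, no risk term) is:

```python
import math
import random

def mppi_step(x0, u_nom, rollout_cost, n_samples=256, sigma=0.5, lam=1.0, rng=None):
    """One MPPI update: perturb the nominal control sequence with Gaussian
    noise, score each sampled sequence with rollout_cost(x0, seq), and
    return the softmax-weighted average sequence.  (Generic MPPI sketch;
    the paper's planner adds a data-informed risk term to the cost.)"""
    rng = rng or random.Random(0)
    H = len(u_nom)
    seqs, costs = [], []
    for _ in range(n_samples):
        seq = [u + rng.gauss(0.0, sigma) for u in u_nom]
        seqs.append(seq)
        costs.append(rollout_cost(x0, seq))
    c_min = min(costs)                   # shift costs for numerical stability
    w = [math.exp(-(c - c_min) / lam) for c in costs]
    total = sum(w)
    # weighted average of the sampled sequences, timestep by timestep
    return [sum(wi * s[t] for wi, s in zip(w, seqs)) / total for t in range(H)]
```

Low-cost samples dominate the average, so the returned sequence shifts the nominal controls toward regions the rollouts found cheap, without requiring gradients of the dynamics or cost.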
|
| |
| 10:00-11:30, Paper MoAIP-09.3 | Add to My Program |
| Energy-Efficient Team Orienteering Problem in the Presence of Time-Varying Ocean Currents |
|
| Mansfield, Ariella | University of Pennsylvania |
| G. Macharet, Douglas | Universidade Federal De Minas Gerais |
| Hsieh, M. Ani | University of Pennsylvania |
Keywords: Task and Motion Planning, Multi-Robot Systems, Planning, Scheduling and Coordination
Abstract: Autonomous Marine Vehicles (AMVs) have gained interest for scientific and commercial applications, including pipeline and algae bloom monitoring, contaminant tracking, and ocean debris removal. The Team Orienteering Problem (TOP) is relevant in this context as Multi-Robot Systems (MRSs) allow for better coverage of the area of interest, simultaneous data collection at different locations, and an increase in the overall robustness and efficiency of the mission. However, route planning for AMVs in dynamic ocean environments is challenging due to the coupling of environmental and vehicle dynamics. We propose a multi-objective formulation that accounts for the trade-offs between visiting multiple task locations and energy consumption by the vehicles subject to a time budget. Different from existing approaches, our method is able to leverage time-varying ocean currents to improve the energy efficiency of resulting routes. We validate our approach experimentally by superimposing ocean flow models with benchmark instances of the TOP.
|
| |
| 10:00-11:30, Paper MoAIP-09.4 | Add to My Program |
| Multi-Agent Multi-Objective Ergodic Search Using Branch and Bound |
|
| Kesarimangalam Srinivasan, Akshaya | Carnegie Mellon University |
| Gutow, Geordan | Carnegie Mellon University |
| Ren, Zhongqiang | Carnegie Mellon University |
| Abraham, Ian | Yale University |
| Vundurthy, Bhaskar | Carnegie Mellon University |
| Choset, Howie | Carnegie Mellon University |
Keywords: Task and Motion Planning, Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents
Abstract: Search and rescue applications often need multiple agents to complete a set of conflicting tasks. This paper studies a Multi-Agent Multi-Objective Ergodic Search (MA-MO-ES) approach to this problem where each objective or task is to cover a domain subject to an information map. The goal is to allocate tasks to agents so that all maps are covered ergodically. The combinatorial nature of task allocation makes it computationally expensive to solve optimally using brute force. Apart from a large number of possible allocations, computing the cost of a task allocation is itself a planning problem. To mitigate the computational challenge, we present a branch and bound-based algorithm with pruning techniques that reduce the number of allocations to be searched to find an optimal allocation. We also present an approach to leverage the similarity between information maps to further reduce computation. Extensive testing on 150 randomly generated test cases shows an order of magnitude improvement in runtime compared to an exhaustive brute force approach.
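The branch-and-bound idea can be illustrated on the underlying makespan-style allocation problem: prune any partial assignment whose worst agent load already matches the best complete allocation found so far (a generic sketch; the paper's bounds additionally exploit similarity between information maps, and its per-allocation costs come from ergodic coverage planning rather than a fixed matrix):

```python
def branch_and_bound_allocation(cost, n_agents):
    """Minimise the maximum per-agent cost over all assignments of tasks to
    agents.  cost[t][a] is task t's cost on agent a.  Partial assignments
    whose current makespan already meets the incumbent are pruned, which is
    what makes the search tractable compared to brute force."""
    n_tasks = len(cost)
    best = {"val": float("inf"), "assign": None}

    def recurse(t, loads, assign):
        if max(loads) >= best["val"]:
            return                      # prune: cannot beat the incumbent
        if t == n_tasks:
            best["val"], best["assign"] = max(loads), assign[:]
            return
        for a in range(n_agents):
            loads[a] += cost[t][a]
            assign.append(a)
            recurse(t + 1, loads, assign)
            assign.pop()
            loads[a] -= cost[t][a]

    recurse(0, [0.0] * n_agents, [])
    return best["val"], best["assign"]
```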
|
| |
| 10:00-11:30, Paper MoAIP-09.5 | Add to My Program |
| Leveraging Single-Goal Predictions to Improve the Efficiency of Multi-Goal Motion Planning with Dynamics |
|
| Lu, Yuanjie | George Mason University |
| Plaku, Erion | George Mason University |
Keywords: Motion and Path Planning, Nonholonomic Motion Planning
Abstract: Multi-goal motion planning requires a robot to plan collision-free and dynamically-feasible motions to reach multiple goals, often in unstructured, obstacle-rich environments. This is challenging due to the complex dependencies between navigation and high-level reasoning, requiring the robot to explore a vast space of feasible motions and goal sequences. Our approach combines machine learning and Traveling Salesman Problem (TSP) solvers with sampling-based motion planning. Machine learning predicts distances and directions between locations, considering obstacles and robot dynamics, which the TSP solver uses to compute promising tours. Sampling-based motion planning expands a motion tree to follow the tours along the predicted directions. We demonstrate the effectiveness of our approach through experiments with vehicle and snake-like robot models operating in unstructured environments with multiple goals.
|
| |
| 10:00-11:30, Paper MoAIP-09.6 | Add to My Program |
| DynGMP: Graph Neural Network-Based Motion Planning in Unpredictable Dynamic Environments |
|
| Zhang, Wenjin | Rutgers University |
| Zang, Xiao | Rutgers University |
| Huang, Lingyi | Rutgers University |
| Sui, Yang | Rutgers University |
| Yu, Jingjin | Rutgers University |
| Chen, Yingying | Rutgers University |
| Yuan, Bo | Rutgers University |
Keywords: Motion and Path Planning, Planning, Scheduling and Coordination, Deep Learning Methods
Abstract: Neural networks have already demonstrated attractive performance for solving motion planning problems, especially in static and predictable environments. However, efficient neural planners that can adapt to unpredictable dynamic environments, a highly demanded scenario in many practical applications, are still under-explored. To fill this research gap and enrich the existing motion planning approaches, in this paper we propose DynGMP, a graph neural network (GNN)-based planner that provides high-performance planning solutions in unpredictable dynamic environments. By fully leveraging prior exploration experience and minimizing the replanning cost incurred by environmental change, DynGMP achieves high planning performance and efficiency simultaneously. Empirical evaluations across different environments show that DynGMP can achieve close to 100% success rate with fast planning speed and short path cost. Compared with existing non-learning and learning-based counterparts, DynGMP shows significant planning performance improvement, e.g., at least 2.7×, 2.2×, 2.4×, and 2× faster planning speed with low path distance in four environments, respectively.
|
| |
| 10:00-11:30, Paper MoAIP-09.7 | Add to My Program |
| Symbolic State Space Optimization for Long Horizon Mobile Manipulation Planning |
|
| Zhang, Xiaohan | SUNY Binghamton |
| Zhu, Yifeng | The University of Texas at Austin |
| Ding, Yan | SUNY Binghamton |
| Jiang, Yuqian | University of Texas at Austin |
| Zhu, Yuke | The University of Texas at Austin |
| Stone, Peter | University of Texas at Austin |
| Zhang, Shiqi | SUNY Binghamton |
Keywords: Task and Motion Planning, Mobile Manipulation, Service Robotics
Abstract: In existing task and motion planning (TAMP) research, it is a common assumption that experts manually specify the state space for task-level planning. A well-developed state space enables the desirable distribution of limited computational resources between task planning and motion planning. However, developing such task-level state spaces can be non-trivial in practice. In this paper, we consider a long horizon mobile manipulation domain including repeated navigation and manipulation. We propose Symbolic State Space Optimization (S3O) for computing a set of abstracted locations and their 2D geometric groundings for generating task-motion plans in such domains. Our approach has been extensively evaluated in simulation and demonstrated on a real mobile manipulator working on clearing up dining tables. Results show the superiority of the proposed method over TAMP baselines in task completion rate and execution time.
|
| |
| 10:00-11:30, Paper MoAIP-09.8 | Add to My Program |
| A Fast and Map-Free Model for Trajectory Prediction in Traffics |
|
| Xiang, Junhong | Chongqing University |
| Zhang, Jingmin | No. 208 Research Institute of China Ordnance Industries |
| Nan, Zhixiong | Chongqing University |
Keywords: Motion and Path Planning, Autonomous Agents, Deep Learning Methods
Abstract: Existing trajectory prediction methods have two shortcomings: (i) nearly all models rely on high-definition (HD) maps, yet map information is not always available in real traffic scenes and HD map-building is expensive and time-consuming; and (ii) existing models usually improve prediction accuracy at the expense of computing efficiency, yet efficiency is crucial for many real applications. To address both, this paper proposes an efficient trajectory prediction model that does not depend on traffic maps. The core idea of our model is to encode each agent's spatial-temporal information in the first stage and to explore multi-agent spatial-temporal interactions in the second stage. By comprehensively utilizing an attention mechanism, LSTM, a graph convolution network, and a temporal transformer in the two stages, our model learns rich dynamic and interaction information for all agents. Our model achieves the highest performance among existing map-free methods and also exceeds most map-based state-of-the-art methods on the Argoverse dataset. In addition, our model exhibits faster inference than the baseline methods.
|
| |
| 10:00-11:30, Paper MoAIP-09.9 | Add to My Program |
| Local Non-Cooperative Games with Principled Player Selection for Scalable Motion Planning |
|
| Chahine, Makram | Massachusetts Institute of Technology |
| Firoozi, Roya | Stanford University |
| Xiao, Wei | MIT |
| Schwager, Mac | Stanford University |
| Rus, Daniela | MIT |
Keywords: Motion and Path Planning, Multi-Robot Systems, Aerial Systems: Applications
Abstract: Game-theoretic motion planners are a powerful tool for the control of interactive multi-agent robot systems. Indeed, contrary to predict-then-plan paradigms, game-theoretic planners do not ignore the interactive nature of the problem, and simultaneously predict the behaviour of other agents while considering changes in one's own policy. This, however, comes at the expense of computational complexity, especially as the number of agents considered grows. In fact, planning with more than a handful of agents can quickly become intractable, disqualifying game-theoretic planners as possible candidates for large scale planning. In this paper, we propose a planning algorithm enabling the use of game-theoretic planners in robot systems with a large number of agents. Our planner is based on the reality of locality of information and thus deploys local games with a selected subset of agents in a receding horizon fashion to plan collision avoiding trajectories. We propose five different principled schemes for selecting game participants and compare their collision avoidance performance. We observe that the use of Control Barrier Functions for priority ranking is a potent solution to the player selection problem for motion planning.
|
| |
| 10:00-11:30, Paper MoAIP-09.10 | Add to My Program |
| Target Attribute Perception Based UAV Real-Time Task Planning in Dynamic Environments |
|
| He, Jinhong | Huazhong University of Science and Technology |
| Sun, Zheyu | Huazhong University of Science and Technology |
| Ming, Delie | Huazhong University of Science and Technology |
| Cai, Chao | Huazhong University of Science and Technology |
| Cao, Ningbo | Huazhong University of Science and Technology |
Keywords: Motion and Path Planning, Computer Vision for Automation, Deep Learning for Visual Perception
Abstract: In this paper, a comprehensive solution for enabling an unmanned aerial vehicle (UAV) to autonomously fly through complex and dynamic environments is proposed. Since moving objects each carry unique attribute information, we propose a method that utilizes deep learning for 3D dynamic environment perception while taking into account limitations in computing resources. For safer dynamic avoidance, we first model the dynamic target and integrate it into a static occupancy grid map, and then construct a gradient field based on its attribute information. To achieve autonomous UAV flight in dynamic environments, we design an adaptive planning method based on gradient optimisation, which achieves significant computational savings by autonomously adjusting the planning frequency and using manually constructed gradients instead of maintaining a signed distance field (SDF). We integrated the above approach into a customised quadrotor system and thoroughly tested it in the real world, verifying its flexibility in handling multiple objects with variable-speed motion in complex environments.
|
| |
| 10:00-11:30, Paper MoAIP-09.11 | Add to My Program |
| Simultaneous Spatial and Temporal Assignment for Fast UAV Trajectory Optimization Using Bilevel Optimization |
|
| Chen, Qianzhong | University of Illinois Urbana-Champaign |
| Cheng, Sheng | University of Illinois Urbana-Champaign |
| Hovakimyan, Naira | University of Illinois at Urbana-Champaign |
Keywords: Constrained Motion Planning, Aerial Systems: Applications, Optimization and Optimal Control
Abstract: In this paper, we propose a framework for fast trajectory planning for unmanned aerial vehicles (UAVs). Our framework is reformulated from an existing bilevel optimization, in which the lower-level problem solves for the optimal trajectory with a fixed time allocation, whereas the upper-level problem updates the time allocation using analytical gradients. The lower-level problem incorporates the safety-set constraints (in the form of inequality constraints) and is cast as a convex quadratic program (QP). Our formulation modifies the lower-level QP by excluding the inequality constraints for the safety sets, which significantly reduces the computation time. The safety-set constraints are moved to the upper-level problem, where the feasible waypoints are updated together with the time allocation using analytical gradients enabled by OptNet. We validate our approach in simulations, where our method's computation time scales linearly with respect to the number of safety sets, in contrast to the state of the art, which scales exponentially.
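The upper-level time-allocation update described above can be illustrated with a toy one-dimensional version. The paper's lower level is a QP differentiated via OptNet; here we substitute an assumed closed-form per-segment cost c_i / T_i^3 (the snap-cost scaling for fixed boundary conditions) so the analytical gradient is explicit. A sketch under that assumption, not the paper's formulation:

```python
import numpy as np

def refine_time_allocation(seg_costs, T0, total_time, lr=0.01, iters=300):
    """Upper-level descent on per-segment durations T_i.

    Assumed closed-form lower level: with fixed boundary conditions, the
    snap-like cost of segment i scales as c_i / T_i^3, so
    J(T) = sum_i c_i / T_i^3 and dJ/dT_i = -3 c_i / T_i^4 analytically.
    After each step, T is rescaled onto the fixed total-time budget.
    """
    T = np.array(T0, dtype=float)
    c = np.asarray(seg_costs, dtype=float)
    for _ in range(iters):
        grad = -3.0 * c / T ** 4         # analytical gradient of J
        T -= lr * grad                   # descend: costly segments get more time
        T = np.clip(T, 1e-3, None)       # keep durations positive
        T *= total_time / T.sum()        # project back onto the time budget
    return T

# The segment with the higher cost coefficient receives the larger share of time.
T_opt = refine_time_allocation([8.0, 1.0], [1.0, 1.0], total_time=2.0)
```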
|
| |
| 10:00-11:30, Paper MoAIP-09.12 | Add to My Program |
| A Non-Prehensile Object Transportation Framework with Adaptive Tilting Based on Quadratic Programming |
|
| Subburaman, Rajesh | University of Naples Federico II |
| Selvaggio, Mario | Università Degli Studi Di Napoli Federico II |
| Ruggiero, Fabio | Università Di Napoli Federico II |
Keywords: Dexterous Manipulation, Optimization and Optimal Control, Intelligent Transportation Systems
Abstract: This work proposes an operational space control framework for non-prehensile object transportation using a robot arm. The control actions for the manipulator are computed by solving a quadratic programming (QP) problem considering the object's and manipulator's kinematic and dynamic constraints. Given the desired transportation trajectory, the proposed controller generates control commands for the robot to achieve the desired motion whilst preventing object slippage. In particular, the controller minimizes the occurrence of object slippage by adaptively regulating the tray orientation. The proposed approach has been extensively evaluated numerically with a 7-degree-of-freedom manipulator, and it is also verified and validated with a real experimental setup.
|
| |
| 10:00-11:30, Paper MoAIP-09.13 | Add to My Program |
| Dynamic Optimization Fabrics for Motion Generation (I) |
|
| Spahn, Max | TU Delft |
| Wisse, Martijn | Delft University of Technology |
| Alonso-Mora, Javier | Delft University of Technology |
Keywords: Mobile Manipulation, Nonholonomic Motion Planning, Motion Control of Manipulators, Geometric Control
Abstract: Optimization fabrics are a geometric approach to real-time local motion generation, where motions are designed by the composition of several differential equations that exhibit a desired motion behavior. We generalize this framework to dynamic scenarios and non-holonomic robots and prove that fundamental properties can be conserved. We show that convergence to desired trajectories and avoidance of moving obstacles can be guaranteed using simple construction rules for the components. Additionally, we present the first quantitative comparisons between optimization fabrics and model predictive control and show that optimization fabrics can generate similar trajectories with better scalability, and thus, a much higher replanning frequency (up to 500 Hz with a 7-degrees-of-freedom robotic arm). Finally, we present empirical results on several robots, including a non-holonomic mobile manipulator with 10 degrees of freedom avoiding a moving human, supporting the theoretical findings.
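The composition of differential equations that optimization fabrics build on resolves several component behaviors into one acceleration by metric-weighted averaging. A minimal sketch with assumed toy components (a goal attractor and a damper); the component forms are illustrative, not from the paper:

```python
import numpy as np

def fabric_resolve(components, x, xdot):
    """Resolve several (metric, acceleration) components into one motion:
    xddot = (sum_i M_i)^(-1) * sum_i M_i f_i(x, xdot)."""
    n = len(x)
    M_sum = np.zeros((n, n))
    Mf_sum = np.zeros(n)
    for metric_fn, accel_fn in components:
        M = metric_fn(x, xdot)
        M_sum += M
        Mf_sum += M @ accel_fn(x, xdot)
    return np.linalg.solve(M_sum, Mf_sum)

# Toy components (illustrative): a spring toward the origin and an
# isotropic damper, each weighted by an identity metric.
attractor = (lambda x, xd: np.eye(2), lambda x, xd: -2.0 * x)
damper = (lambda x, xd: np.eye(2), lambda x, xd: -1.0 * xd)
acc = fabric_resolve([attractor, damper], np.array([1.0, 0.0]), np.zeros(2))
# acc is the metric-weighted average of the two accelerations: [-1.0, 0.0]
```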
|
| |
| MoAIP-10 Regular session, Hall E |
Add to My Program |
| Clone of 'Learning for Manipulation I' |
|
| |
| |
| 10:00-11:30, Paper MoAIP-10.1 | Add to My Program |
| Foldsformer: Learning Sequential Multi-Step Cloth Manipulation with Space-Time Attention |
|
| Mo, Kai | Tsinghua University, Shenzhen International Graduate School |
| Xia, Chongkun | Tsinghua University |
| Wang, Xueqian | Center for Artificial Intelligence and Robotics, Graduate School |
| Deng, Yuhong | Tsinghua University |
| Gao, Xue-Hai | Tsinghua University |
| Liang, Bin | Tsinghua University |
Keywords: Deep Learning in Grasping and Manipulation, Perception-Action Coupling
Abstract: Sequential multi-step cloth manipulation is a challenging problem in robotic manipulation, requiring a robot to perceive the cloth state and plan a sequence of chained actions leading to the desired state. Most previous works address this problem in a goal-conditioned way, where a goal observation must be given for each specific task and cloth configuration, which is neither practical nor efficient. Thus, we present a novel multi-step cloth manipulation planning framework named Foldsformer. Foldsformer can complete similar tasks with only a general demonstration and utilizes a space-time attention mechanism to capture the instruction information behind this demonstration. We experimentally evaluate Foldsformer on four representative sequential multi-step manipulation tasks and show that Foldsformer significantly outperforms state-of-the-art approaches in simulation. Foldsformer can complete multi-step cloth manipulation tasks even when configurations of the cloth (e.g., size and pose) vary from those in the general demonstrations. Furthermore, our approach can be transferred from simulation to the real world without additional training or domain randomization. Despite training only on rectangular cloths, we also show that our approach can generalize to unseen cloth shapes (T-shirts and shorts). Videos are available at https://sites.google.com/view/foldsformer.
|
| |
| 10:00-11:30, Paper MoAIP-10.2 | Add to My Program |
| GraNet: A Multi-Level Graph Network for 6-DoF Grasp Pose Generation in Cluttered Scenes |
|
| Wang, Haowen | Shanghai Jiao Tong University |
| Niu, Wanhao | Shanghai Jiao Tong University |
| Zhuang, Chungang | Shanghai Jiao Tong University |
Keywords: Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation, Computer Vision for Automation
Abstract: 6-DoF object-agnostic grasping in unstructured environments is a critical yet challenging task in robotics. Most current works use non-optimized approaches to sample grasp locations and learn spatial features without considering the grasping task. This paper proposes GraNet, a graph-based grasp pose generation framework that translates a point cloud scene into multi-level graphs and propagates features through graph neural networks. By building graphs at the scene level, object level, and grasp point level, GraNet enhances feature embedding at multiple scales while progressively converging to the ideal grasping locations by learning. Our pipeline can thus characterize the spatial distribution of grasps in cluttered scenes, leading to a higher rate of effective grasping. Furthermore, we enhance the representation ability of scalable graph networks with a structure-aware attention mechanism to exploit local relations in graphs. Our method achieves state-of-the-art performance on the large-scale GraspNet-1Billion benchmark, especially in grasping unseen objects (+11.62 AP). The real-robot experiment shows a high success rate in grasping scattered objects, verifying the effectiveness of the proposed approach in unstructured environments.
|
| |
| 10:00-11:30, Paper MoAIP-10.3 | Add to My Program |
| Modular Neural Network Policies for Learning In-Flight Object Catching with a Robot Hand-Arm System |
|
| Hu, Wenbin | University of Edinburgh |
| Acero, Fernando | University of Edinburgh |
| Triantafyllidis, Eleftherios | The University of Edinburgh |
| Liu, Zhaocheng | The University of Edinburgh |
| Li, Zhibin (Alex) | University College London |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Perception-Action Coupling
Abstract: We present a modular framework designed to enable a robot hand-arm system to learn how to catch flying objects, a task that requires fast, reactive, and accurately-timed robot motions. Our framework consists of five core modules: (i) an object state estimator that learns object trajectory prediction, (ii) a catching pose quality network that learns to score and rank object poses for catching, (iii) a reaching control policy trained to move the robot hand to pre-catch poses, (iv) a grasping control policy trained to perform soft catching motions for safe and robust grasping, and (v) a gating network trained to synthesize the actions given by the reaching and grasping policy. The former two modules are trained via supervised learning and the latter three use deep reinforcement learning in a simulated environment. We conduct extensive evaluations of our framework in simulation for each module and the integrated system, to demonstrate high success rates of in-flight catching and robustness to perturbations and sensory noise. Whilst only simple cylindrical and spherical objects are used for training, the integrated system shows successful generalization to a variety of household objects that are not used in training.
|
| |
| 10:00-11:30, Paper MoAIP-10.4 | Add to My Program |
| GVCCI: Lifelong Learning of Visual Grounding for Language-Guided Robotic Manipulation |
|
| Kim, Junghyun | Seoul National University |
| Kang, Gi-Cheon | Seoul National University |
| Kim, Jaein | Seoul National University |
| Shin, Suyeon | Seoul National University |
| Zhang, Byoung-Tak | Seoul National University |
Keywords: Multi-Modal Perception for HRI, Deep Learning Methods, Autonomous Agents
Abstract: Language-Guided Robotic Manipulation (LGRM) is a challenging task as it requires a robot to understand human instructions to manipulate everyday objects. Recent approaches in LGRM rely on pre-trained Visual Grounding (VG) models to detect objects without adapting to manipulation environments. This results in a performance drop due to a substantial domain gap between the pre-training and real-world data. A straightforward solution is to collect additional training data, but the cost of human annotation is prohibitive. In this paper, we propose Grounding Vision to Ceaselessly Created Instructions (GVCCI), a lifelong learning framework for LGRM, which continuously learns VG without human supervision. GVCCI iteratively generates synthetic instructions via object detection and trains the VG model with the generated data. We validate our framework in offline and online settings across diverse environments on different VG models. Experimental results show that accumulating synthetic data from GVCCI leads to a steady improvement in VG by up to 56.7% and improves the resultant LGRM by up to 29.4%. Furthermore, the qualitative analysis shows that the unadapted VG model often fails to find correct objects due to a strong bias learned from the pre-training data. Finally, we introduce a novel VG dataset for LGRM, consisting of nearly 252k image-object-instruction triplets from diverse manipulation environments.
|
| |
| 10:00-11:30, Paper MoAIP-10.5 | Add to My Program |
| Bag All You Need: Learning a Generalizable Bagging Strategy for Heterogeneous Objects |
|
| Bahety, Arpit | Columbia University |
| Jain, Shreeya | Columbia University |
| Ha, Huy | Columbia University |
| Hager, Nathalie | Columbia University |
| Burchfiel, Benjamin | Toyota Research Institute |
| Cousineau, Eric | Toyota Research Institute |
| Feng, Siyuan | Toyota Research Institute |
| Song, Shuran | Columbia University |
Keywords: Deep Learning in Grasping and Manipulation, Manipulation Planning, Service Robotics
Abstract: We introduce a practical robotics solution for the task of heterogeneous bagging, requiring the placement of multiple rigid and deformable objects into a deformable bag. This is a difficult task as it features complex interactions between multiple highly deformable objects under limited observability. To tackle these challenges, we propose a robotic system consisting of two learned policies: a rearrangement policy that learns to place multiple rigid objects and fold deformable objects in order to achieve desirable pre-bagging conditions, and a lifting policy to infer suitable grasp points for bi-manual bag lifting. We evaluate these learned policies on a real-world three-arm robot platform that achieves a 70% heterogeneous bagging success rate with novel objects. To facilitate future research and comparison, we also develop a novel heterogeneous bagging simulation benchmark that will be made publicly available.
|
| |
| 10:00-11:30, Paper MoAIP-10.6 | Add to My Program |
| Multi-Source Fusion for Voxel-Based 7-DoF Grasping Pose Estimation |
|
| Qiu, Junning | Xi'an Jiaotong University |
| Wang, Fei | Xi'an Jiaotong University |
| Dang, Zheng | EPFL |
Keywords: Deep Learning in Grasping and Manipulation, Visual Learning, Deep Learning Methods
Abstract: In this work, we tackle the problem of 7-DoF grasping pose estimation (6-DoF pose plus the opening width of a parallel-jaw gripper) from point cloud data, which is a fundamental task in robotic manipulation. Most existing methods adopt 3D voxel CNNs as the backbone for their efficiency in handling unordered point cloud data. However, we found that these approaches overlook detailed information in the point clouds, resulting in decreased performance. Through our analysis, we identified quantization loss and boundary information loss within 3D convolutional layers as the primary causes of this issue. To address these challenges, we introduce two novel branches: one adds an extra positional encoding operation to preserve details and unique features for each point, and the other uses a 2D CNN to operate on the range-based image, which better aggregates boundary information on a continuous 2D domain. To integrate these branches with the original branch, we introduce a novel multi-source fusion gating mechanism to aggregate features. Our approach achieved state-of-the-art performance on the GraspNet-1Billion benchmark and demonstrated high success rates in real robotic experiments across different scenes. Our work has the potential to improve the performance of robotic grasping systems and contribute to the field of robotics.
|
| |
| 10:00-11:30, Paper MoAIP-10.7 | Add to My Program |
| VL-Grasp: A 6-Dof Interactive Grasp Policy for Language-Oriented Objects in Cluttered Indoor Scenes |
|
| Lu, Yuhao | Tsinghua University |
| Fan, Yixuan | Tsinghua University |
| Deng, Beixing | Tsinghua University |
| Liu, Fangfu | Tsinghua University |
| Li, Yali | Tsinghua University |
| Wang, Shengjin | Tsinghua University |
Keywords: Deep Learning in Grasping and Manipulation, Multi-Modal Perception for HRI, Data Sets for Robotic Vision
Abstract: Robotic grasping faces new challenges in human-robot-interaction scenarios. We consider the task in which the robot grasps a target object designated by a human's language directives. The robot not only needs to locate the target based on vision-and-language information, but also needs to predict reasonable grasp pose candidates at various views and postures. In this work, we propose a novel interactive grasp policy, named Visual-Lingual-Grasp (VL-Grasp), to grasp the target specified by human language. First, we build a new challenging visual grounding dataset to provide functional training data for robotic interactive perception in indoor environments. Second, we propose a 6-DoF interactive grasp policy combining visual grounding and 6-DoF grasp pose detection to extend the universality of interactive grasping. Third, we design a grasp pose filter module to enhance the performance of the policy. Experiments demonstrate the effectiveness and extensibility of VL-Grasp in the real world. VL-Grasp achieves a success rate of 72.5% in different indoor scenes. The code and dataset are available at https://github.com/luyh20/VL-Grasp.
|
| |
| 10:00-11:30, Paper MoAIP-10.8 | Add to My Program |
| QDP: Learning to Sequentially Optimise Quasi-Static and Dynamic Manipulation Primitives for Robotic Cloth Manipulation |
|
| Blanco-Mulero, David | Aalto University |
| Alcan, Gokhan | Aalto University |
| Abu-Dakka, Fares | Technische Universität München |
| Kyrki, Ville | Aalto University |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Manipulation Planning
Abstract: Pre-defined manipulation primitives are widely used for cloth manipulation. However, cloth properties such as stiffness or density can highly impact the performance of these primitives. Although existing solutions have tackled the parameterisation of pick and place locations, the effect of factors such as the velocity or trajectory of quasi-static and dynamic manipulation primitives has been neglected. Choosing appropriate values for these parameters is crucial to cope with the range of materials present in household cloth objects. To address this challenge, we introduce the Quasi-Dynamic Parameterisable (QDP) method, which optimises parameters such as the motion velocity in addition to the pick and place positions of quasi-static and dynamic manipulation primitives. In this work, we leverage the framework of Sequential Reinforcement Learning to sequentially decouple the parameters that compose the primitives. To evaluate the effectiveness of the method, we focus on the task of cloth unfolding with a robotic arm in simulation and real-world experiments. Our results in simulation show that choosing optimal parameters for the primitives improves performance by 20% compared to sub-optimal ones. Real-world results demonstrate the advantage of modifying the velocity and height of manipulation primitives for cloths with different mass, stiffness, shape, and size. Supplementary material, videos, and code can be found at https://sites.google.com/view/qdp-srl.
|
| |
| 10:00-11:30, Paper MoAIP-10.9 | Add to My Program |
| Robust Visual Sim-To-Real Transfer for Robotic Manipulation |
|
| Garcia, Ricardo | Inria |
| Strudel, Robin | INRIA Paris |
| Chen, Shizhe | Inria |
| Arlaud, Etienne | INRIA |
| Laptev, Ivan | INRIA |
| Schmid, Cordelia | Inria |
Keywords: Deep Learning in Grasping and Manipulation, Learning from Demonstration, Transfer Learning
Abstract: Learning visuomotor policies in simulation is much safer and cheaper than in the real world. However, due to discrepancies between the simulated and real data, simulator-trained policies often fail when transferred to real robots. One common approach to bridge the visual sim-to-real domain gap is domain randomization (DR). While previous work mainly evaluates DR for disembodied tasks, such as pose estimation and object detection, here we systematically explore visual domain randomization methods and benchmark them on a rich set of challenging robotic manipulation tasks. In particular, we propose an off-line proxy task of cube localization to select DR parameters for texture randomization, lighting randomization, variations of object colors and camera parameters. Notably, we demonstrate that DR parameters have similar impact on our off-line proxy task and on-line policies. We, hence, use off-line optimized DR parameters to train visuomotor policies in simulation and directly apply such policies to a real robot. Our approach achieves 93% success rate on average when tested on a diverse set of challenging manipulation tasks. Moreover, we evaluate the robustness of policies to visual variations in real scenes and show that our simulator-trained policies outperform policies learned using real but limited data. Code, simulation environment, real robot datasets and trained models are available at https://www.di.ens.fr/willow/research/robust_s2r/.
|
| |
| 10:00-11:30, Paper MoAIP-10.10 | Add to My Program |
| Multi-Dimensional Deformable Object Manipulation Using Equivariant Models |
|
| Fu, Tianyu | East China University of Science and Technology |
| Tang, Yang | East China University of Science and Technology |
| Wu, Tianyu | East China University of Science and Technology |
| Xia, Xiaowu | East China University of Science and Technology |
| Wang, Jianrui | East China University of Science and Technology |
| Zhao, Chaoqiang | East China University of Science and Technology |
Keywords: Deep Learning in Grasping and Manipulation, Learning from Demonstration, Imitation Learning
Abstract: Manipulating deformable objects, such as ropes (1D), fabrics (2D), and bags (3D), is a very challenging problem in robotic research since a deformable object has a high degree of freedom in its physical state and nonlinear dynamics. Compared with single-dimensional deformable objects, multi-dimensional object manipulation suffers from the difficulty of correctly recognizing the characteristics of the object and making accurate action decisions on deformable objects of various dimensions. Some methods have been proposed that use neural networks to rearrange deformable objects in all dimensions, but they are not accurate in predicting the motion of the robot as they only consider equivariance in the picking action. To address this problem, we present a novel Transporter Network encoded and decoded with equivariance to generalize to different picking and placing positions. Additionally, we propose an equivariant goal-conditioned model to enable the robot to manipulate deformable objects into flexible configurations without artificially marked visual anchors for the target position. Finally, experiments in Deformable-Ravens and the real world demonstrate that our equivariant models are more sample efficient than the traditional Transporter Network. The video is available at https://youtu.be/SH4aV2f0wt0.
|
| |
| 10:00-11:30, Paper MoAIP-10.11 | Add to My Program |
| Adversarial Object Rearrangement in Constrained Environments with Heterogeneous Graph Neural Networks |
|
| Lou, Xibai | University of Minnesota Twin Cities |
| Yu, Houjian | University of Minnesota, Twin Cities |
| Worobel, Ross | University of Minnesota |
| Yang, Yang | University of Minnesota |
| Choi, Changhyun | University of Minnesota, Twin Cities |
Keywords: Deep Learning in Grasping and Manipulation, Deep Learning for Visual Perception, Task and Motion Planning
Abstract: Adversarial object rearrangement in the real world (e.g., previously unseen or oversized items in kitchens and stores) could benefit from understanding task scenes, which inherently entail heterogeneous components such as current objects, goal objects, and environmental constraints. The semantic relationships among these components are distinct from each other and crucial for multi-skilled robots to perform efficiently in everyday scenarios. We propose a hierarchical robotic manipulation system that learns the underlying relationships and maximizes the collaborative power of its diverse skills (e.g., pick-place, push) for rearranging adversarial objects in constrained environments. The high-level coordinator employs a heterogeneous graph neural network (HetGNN), which reasons about the current objects, goal objects, and environmental constraints; the low-level 3D Convolutional Neural Network-based actors execute the action primitives. Our approach is trained entirely in simulation, and achieved an average success rate of 87.88% and a planning cost of 12.82 in real-world experiments, surpassing all baseline methods. Supplementary material is available at https://sites.google.com/umn.edu/versatile-rearrangement.
|
| |
| 10:00-11:30, Paper MoAIP-10.12 | Add to My Program |
| Probabilistic Slide-Support Manipulation Planning in Clutter |
|
| Shusei, Nagato | Osaka University |
| Motoda, Tomohiro | National Institute of Advanced Industrial Science and Technology |
| Nishi, Takao | Osaka University |
| Petit, Damien | Osaka University |
| Kiyokawa, Takuya | Osaka University |
| Wan, Weiwei | Osaka University |
| Harada, Kensuke | Osaka University |
Keywords: Deep Learning in Grasping and Manipulation, Bimanual Manipulation, Manipulation Planning
Abstract: To safely and efficiently extract an object from clutter, this paper presents a bimanual manipulation planner in which one hand of the robot is used to slide the target object out of the clutter while the other hand is used to support the surrounding objects to prevent the clutter from collapsing. Our method uses a neural network to predict the physical behavior of the clutter when the target object is moved. We generate the most efficient action based on Monte Carlo tree search. The grasping and sliding actions are planned to minimize the number of motion sequences needed to pick the target object. In addition, the object to be supported is chosen to minimize the position change of the surrounding objects. Experiments with a real bimanual robot confirmed that the robot could retrieve the target object while reducing the total number of motion sequences and improving safety.
|
| |
| 10:00-11:30, Paper MoAIP-10.13 | Add to My Program |
| GOATS: Goal Sampling Adaptation for Scooping with Curriculum Reinforcement Learning |
|
| Niu, Yaru | Carnegie Mellon University |
| Jin, Shiyu | Baidu |
| Zhang, Zeqing | The University of Hong Kong |
| Zhu, Jiacheng | Carnegie Mellon University |
| Zhao, Ding | Carnegie Mellon University |
| Zhang, Liangjun | Baidu |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning
Abstract: In this work, we first formulate the problem of robotic water scooping using goal-conditioned reinforcement learning. This task is particularly challenging due to the complex dynamics of fluids and the need to achieve multi-modal goals. The policy is required to successfully reach both position goals and water amount goals, which leads to a large convoluted goal state space. To overcome these challenges, we introduce Goal Sampling Adaptation for Scooping (GOATS), a curriculum reinforcement learning method that can learn an effective and generalizable policy for robot scooping tasks. Specifically, we use a goal-factorized reward formulation and interpolate position goal distributions and amount goal distributions to create a curriculum throughout the learning process. As a result, our proposed method outperforms the baselines in simulation and achieves 5.46% and 8.71% amount errors on bowl scooping and bucket scooping tasks, respectively, under 1000 variations of initial water states in the tank and a large goal state space. Besides being effective in simulation environments, our method can efficiently adapt to noisy real-robot water-scooping scenarios with diverse physical configurations and unseen settings, demonstrating superior efficacy and generalizability. The videos of this work are available on our project page: https://sites.google.com/view/goatscooping.
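The curriculum idea above — interpolating between an easy goal distribution and the target distribution as training progresses — can be sketched as follows; the Gaussian goal distributions and the linear interpolation schedule are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def sample_curriculum_goal(rng, easy_mu, easy_std, goal_mu, goal_std, progress):
    """Sample a goal from a Gaussian interpolated between an 'easy'
    distribution (near current behaviour) and the final target
    distribution; progress in [0, 1] moves the curriculum toward the target."""
    t = float(np.clip(progress, 0.0, 1.0))
    mu = (1.0 - t) * np.asarray(easy_mu) + t * np.asarray(goal_mu)
    std = (1.0 - t) * np.asarray(easy_std) + t * np.asarray(goal_std)
    return rng.normal(mu, std)

rng = np.random.default_rng(0)
# Early in training goals cluster near 0; at full progress, near the target 1.0.
early_goal = sample_curriculum_goal(rng, [0.0], [0.01], [1.0], [0.2], progress=0.0)
late_goal = sample_curriculum_goal(rng, [0.0], [0.01], [1.0], [0.2], progress=1.0)
```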
|
| |
| MoAIP-11 Regular session, Hall E |
Add to My Program |
| Clone of 'Aerial Systems - Applications I' |
|
| |
| |
| 10:00-11:30, Paper MoAIP-11.1 | Add to My Program |
| Auto Filmer: Autonomous Aerial Videography under Human Interaction |
|
| Zhang, Zhiwei | Zhejiang University |
| Zhong, Yuhang | Nankai University |
| Guo, Junlong | Zhejiang University |
| Wang, Qianhao | Zhejiang University |
| Xu, Chao | Zhejiang University |
| Gao, Fei | Zhejiang University |
Keywords: Aerial Systems: Applications, Human-Aware Motion Planning, Aerial Systems: Perception and Autonomy
Abstract: The advance of unmanned aerial vehicles (UAVs) has enabled customers and directors to film from the air. However, operating a drone to produce the desired video of a moving object is hard to achieve. This letter proposes an autonomous aerial videography system that integrates customized shots and drone dynamics. We design a user-friendly interface for the operator to create the desired shot in real time. The shot information is then transmitted to the kinodynamic path search process, in which a safe shooting path is computed. Later, feasible regions and safe flight corridors are constructed for safety and visibility. Finally, a joint optimization is carried out to generate the trajectory of the quadrotor and the gimbal to maintain the required image composition. Extensive simulation and real-world experiments validate the effectiveness of our method.
|
| |
| 10:00-11:30, Paper MoAIP-11.2 | Add to My Program |
| New Era in Cultural Heritage Preservation: Cooperative Aerial Autonomy for Fast Digitalization of Difficult-To-Access Interiors of Historical Monuments (I) |
|
| Petráček, Pavel | Czech Technical University in Prague |
| Krátký, Vít | Czech Technical University in Prague |
| Baca, Tomas | Ceske Vysoke Uceni Technicke V Praze, FEL |
| Petrlik, Matej | Czech Technical University in Prague, Faculty of Electrical Engi |
| Saska, Martin | Czech Technical University in Prague |
Keywords: Aerial Systems: Applications, Aerial Systems: Perception and Autonomy, Multi-Robot Systems
Abstract: Digital documentation of large interiors of historical buildings is an exhausting task since most of the areas of interest are beyond typical human reach. We advocate the use of autonomous teams of multi-rotor Unmanned Aerial Vehicles (UAVs) to speed up the documentation process by several orders of magnitude while allowing for a repeatable, accurate, and condition-independent solution capable of precise collision-free operation at great heights. The proposed multi-robot approach allows for performing tasks requiring dynamic scene illumination in large-scale real-world scenarios, a process previously applicable only in small-scale laboratory-like conditions. Extensive experimental analyses range from single-UAV imaging to specialized lighting techniques requiring accurate coordination of multiple UAVs. The system's robustness is demonstrated in more than two hundred autonomous flights in fifteen historical monuments requiring superior safety while lacking access to external localization. This unique experimental campaign, carried out in cooperation with restorers and conservators, brought numerous lessons transferable to other safety-critical robotic missions in documentation and inspection tasks.
|
| |
| 10:00-11:30, Paper MoAIP-11.3 | Add to My Program |
| Tight Collision Probability for UAV Motion Planning in Uncertain Environment |
|
| Liu, Tianyu | The University of Hong Kong |
| Zhang, Fu | University of Hong Kong |
| Gao, Fei | Zhejiang University |
| Pan, Jia | University of Hong Kong |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Collision Avoidance
Abstract: Operating unmanned aerial vehicles (UAVs) in complex environments that feature dynamic obstacles and external disturbances poses significant challenges, primarily due to the inherent uncertainty in such scenarios. Additionally, inaccurate robot localization and modeling errors further exacerbate these challenges. Recent research on UAV motion planning in static environments has been unable to cope with the rapidly changing surroundings, resulting in trajectories that may not be feasible. Moreover, previous approaches that have addressed dynamic obstacles or external disturbances in isolation are insufficient to handle the complexities of such environments. This paper proposes a reliable motion planning framework for UAVs, integrating various uncertainties into a chance constraint that characterizes the uncertainty in a probabilistic manner. The chance constraint provides a probabilistic safety certificate by calculating the collision probability between the robot's Gaussian-distributed forward reachable set and states of obstacles. To reduce the conservatism of the planned trajectory, we propose a tight upper bound of the collision probability and evaluate it both exactly and approximately. The approximated solution is used to generate motion primitives as a reference trajectory, while the exact solution is leveraged to iteratively optimize the trajectory for better results. Our method is thoroughly tested in simulation and real-world experiments, verifying its reliability and effectiveness in uncertain environments.
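Chance constraints of the kind described above build on evaluating the probability that a Gaussian-distributed robot state enters an obstacle region. A minimal sketch for the standard half-space case (not the paper's tight bound for forward reachable sets):

```python
import math
import numpy as np

def halfspace_collision_prob(mu, Sigma, a, b):
    """P(a^T x <= b) for x ~ N(mu, Sigma): the chance that a Gaussian
    robot state lies inside the obstacle half-space {x : a^T x <= b}.
    Uses the fact that a^T x ~ N(a^T mu, a^T Sigma a)."""
    a = np.asarray(a, float)
    mean = a @ np.asarray(mu, float)
    var = a @ np.asarray(Sigma, float) @ a
    z = (b - mean) / math.sqrt(var)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # Gaussian CDF at z

# Robot mean 3 m from the obstacle plane x <= 0, unit position covariance:
# the collision probability is Phi(-3), well below a 1% chance constraint.
p = halfspace_collision_prob([3.0, 0.0], np.eye(2), [1.0, 0.0], 0.0)
```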
|
| |
| 10:00-11:30, Paper MoAIP-11.4 | Add to My Program |
| Dodging Like a Bird: An Inverted Dive Maneuver Taking by Lifting-Wing Multicopters |
|
| Gao, Wenhan | Beihang University |
| Wang, Shuai | Beihang University |
| Quan, Quan | Beihang University |
Keywords: Aerial Systems: Applications, Motion and Path Planning
Abstract: It is crucial for hybrid unmanned aerial vehicles, such as lifting-wing multicopters, to plan a continuous, smooth, and collision-free trajectory to avoid obstacles. Unlike quadcopters, which typically work in indoor environments, lifting-wing multicopters typically fly at a high altitude with a high cruising speed, requiring higher maneuverability in the vertical direction. Inspired by birds, lifting-wing multicopters can take an inverted flight maneuver to gain more maneuverability than the corresponding multicopter owing to the additional lifting wing. In this paper, a rotation-aware collision-free motion planning strategy is proposed that takes aerodynamics into consideration and allows lifting-wing multicopters to fly at large rotation angles, even in inverted postures. Specifically, a collision-free state sequence is found using rotation-aware primitives by solving a graph search problem. The sequence is then refined with B-spline into smooth trajectories to be tracked by the differential flatness-based controller for lifting-wing multicopters. We analyze the proposed motion planning algorithm in different scenarios and demonstrate the feasibility of the generated trajectories in simulation and real-world experiments.
|
| |
| 10:00-11:30, Paper MoAIP-11.5 | Add to My Program |
| Model-Based Planning and Control for Terrestrial-Aerial Bimodal Vehicles with Passive Wheels |
|
| Zhang, Ruibin | Zhejiang University |
| Lin, Junxiao | Zhejiang University |
| Wu, Yuze | Zhejiang University |
| Gao, Yuman | Zhejiang University |
| Wang, Chi | Zhejiang University |
| Xu, Chao | Zhejiang University |
| Cao, Yanjun | Zhejiang University, Huzhou Institute of Zhejiang University |
| Gao, Fei | Zhejiang University |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Motion Control
Abstract: Terrestrial and aerial bimodal vehicles have gained widespread attention due to their cross-domain maneuverability. Nevertheless, their bimodal dynamics significantly increase the complexity of motion planning and control, thus hindering robust and efficient autonomous navigation in unknown environments. To resolve this issue, we develop a model-based planning and control framework for terrestrial aerial bimodal vehicles. This work begins by deriving a unified dynamic model and the corresponding differential flatness. Leveraging differential flatness, an optimization-based trajectory planner is proposed, which takes into account both solution quality and computational efficiency. Moreover, we design a tracking controller using nonlinear model predictive control based on the proposed unified dynamic model to achieve accurate trajectory tracking and smooth mode transition. We validate our framework through extensive benchmark comparisons and experiments, demonstrating its effectiveness in terms of planning quality and control performance.
|
| |
| 10:00-11:30, Paper MoAIP-11.6 | Add to My Program |
| Polynomial-Based Online Planning for Autonomous Drone Racing in Dynamic Environments |
|
| Wang, Qianhao | Zhejiang University |
| Wang, Dong | Zhejiang University |
| Xu, Chao | Zhejiang University |
| Gao, Alan | Fan'gang |
| Gao, Fei | Zhejiang University |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Task and Motion Planning
Abstract: In recent years, there has been noteworthy advancement in autonomous drone racing. However, the primary focus has been on minimizing execution time, while scant attention has been given to the challenges of dynamic environments. The high-speed nature of racing scenarios, coupled with the potential for unforeseeable environmental alterations, presents stringent requirements for online replanning and its timeliness. For racing in dynamic environments, we propose an online replanning framework with an efficient polynomial trajectory representation. We trade off between aggressive speed and flexible obstacle avoidance based on an optimization approach. Additionally, to ensure safety and precision when crossing intermediate racing waypoints, we formulate this demand as hard constraints during planning. For dynamic obstacles, parallel multi-topology trajectory planning is designed based on engineering considerations to prevent racing time loss due to local optima. The framework is integrated into a quadrotor system and was successfully demonstrated at the DJI Robomaster Intelligent UAV Championship, where it completed the racing track and placed first, finishing in less than half the time of the second-place team.
|
| |
| 10:00-11:30, Paper MoAIP-11.7 | Add to My Program |
| Autonomous Power Line Inspection with Drones Via Perception-Aware MPC |
|
| Xing, Jiaxu | ETH Zurich |
| Cioffi, Giovanni | University of Zurich |
| Hidalgo Carrio, Javier | University of Zurich and ETH Zurich |
| Scaramuzza, Davide | University of Zurich |
Keywords: Aerial Systems: Applications, Aerial Systems: Perception and Autonomy
Abstract: Drones have the potential to revolutionize power line inspection by increasing productivity, reducing inspection time, improving data quality, and eliminating the risks for human operators. Current state-of-the-art systems for power line inspection have two shortcomings: (i) control is decoupled from perception and needs accurate information about the location of the power lines and masts; (ii) obstacle avoidance is decoupled from the power line tracking, which results in poor tracking in the vicinity of the power masts, and, consequently, in decreased data quality for visual inspection. In this work, we propose a model predictive controller (MPC) that overcomes these limitations by tightly coupling perception and action. Our controller generates commands that maximize the visibility of the power lines while, at the same time, safely avoiding the power masts. For power line detection, we propose a lightweight learning-based detector that is trained only on synthetic data and is able to transfer zero-shot to real-world power line images. We validate our system in simulation and real-world experiments on a mock-up power line infrastructure. We release our code and datasets to the public.
|
| |
| 10:00-11:30, Paper MoAIP-11.8 | Add to My Program |
| A Perching and Tilting Aerial Robot for Precise and Versatile Power Tool Work on Vertical Walls |
|
| Dautzenberg, Roman | ETH Zürich |
| Küster, Timo | ETH Zürich |
| Mathis, Timon | ETH Zürich |
| Roth, Yann | ETH Zürich |
| Steinauer, Curdin | ETH Zürich |
| Käppeli, Gabriel | ETH Zürich |
| Santen, Julian | ETH Zürich |
| Arranhado, Alina | ETH Zürich |
| Biffar, Friederike | ETH Zürich |
| Kötter, Till | ETH Zürich |
| Lanegger, Christian | ETH Zurich |
| Allenspach, Mike | ETH Zürich |
| Siegwart, Roland | ETH Zurich |
| Bähnemann, Rik | ETH Zürich |
Keywords: Aerial Systems: Applications, Robotics and Automation in Construction, Actuation and Joint Mechanisms
Abstract: Drilling, grinding, and setting anchors on vertical walls are fundamental processes in everyday construction work. Manually performing these tasks is error-prone, potentially dangerous, and laborious at height. Today, heavy mobile ground robots can perform automatic power tool work. However, aerial vehicles could be deployed in untraversable environments and reach inaccessible places. Existing drone designs do not provide the large forces, payload, and high precision required for using power tools. This work presents the first aerial robot design to perform versatile manipulation tasks on vertical concrete walls with continuous forces of up to 150 N. The platform combines a quadrotor with active suction cups for perching on walls and a lightweight, tiltable linear tool table. This combination minimizes weight by using the propulsion system for flying, surface alignment, and feed during manipulation, and allows precise positioning of the power tool. We evaluate our design in a concrete drilling application - a challenging construction process that requires high forces, accuracy, and precision. In 30 trials, our design can accurately pinpoint a target position despite perching imprecision. Nine visually guided drilling experiments demonstrate a drilling precision of 6 mm without further automation. Aside from drilling, we also demonstrate the versatility of the design by setting an anchor into concrete.
|
| |
| 10:00-11:30, Paper MoAIP-11.9 | Add to My Program |
| Resource-Constrained Station-Keeping for Latex Balloons Using Reinforcement Learning |
|
| Saunders, Jack | University of Bath |
| Prenevost, Loïc | Lux Aerobot |
| Şimşek, Özgür | University of Bath |
| Hunter, Alan Joseph | University of Bath |
| Li, Wenbin | University of Bath |
Keywords: Aerial Systems: Applications, Machine Learning for Robot Control, Reinforcement Learning
Abstract: High-altitude balloons have proved useful for ecological aerial surveys, atmospheric monitoring, and communication relays. However, due to weight and power constraints, there is a need to investigate alternate modes of propulsion to navigate in the stratosphere. Very recently, reinforcement learning has been proposed as a control scheme to maintain balloons in the region of a fixed location, facilitated by diverse opposing wind fields at different altitudes. Although air-pump-based station keeping has been explored, there is no research on the control problem for venting-and-ballasting-actuated balloons, which are commonly used as a low-cost alternative. We show how reinforcement learning can be used for this type of balloon. Specifically, we use the soft actor-critic algorithm, which is able to station-keep within 50 km of the target for 25% of the flight on average, consistent with the state of the art. Furthermore, we show that the proposed controller effectively minimises the consumption of resources, thereby supporting long-duration flights. We frame the controller as a continuous-control reinforcement learning problem, which allows for a more diverse range of trajectories than current state-of-the-art work, which uses discrete action spaces. Furthermore, through continuous control, we can make use of larger ascent rates that are not possible using air pumps. The desired ascent rate is decoupled into a desired altitude and a time factor to provide a more transparent policy compared to the low-level control commands used in previous works. Finally, by applying the equations of motion, we establish appropriate thresholds for venting and ballasting to prevent the agent from exploiting the environment. More specifically, we ensure actions are physically feasible by enforcing constraints on venting and ballasting.
|
| |
| 10:00-11:30, Paper MoAIP-11.10 | Add to My Program |
| A Light-Weight, Low-Cost, and Sustainable Planning System for UAVs Using a Local Map Origin Update Approach |
|
| Lee, Dasol | Agency for Defense Development |
| La, Jinche | Agency for Defense Development |
| Joo, Sanghyun | Agency for Defense Development |
Keywords: Aerial Systems: Applications, Motion and Path Planning
Abstract: This paper proposes a sustainable planning system for small-sized unmanned aerial vehicles (UAVs). The mapping module of the system uses a voxel array as its data structure and introduces a local map origin update feature. This approach has the clear advantage that the planning system can sustainably plan trajectories regardless of operating radius and flight distance, and it achieves constant time complexity, O(1), unlike other representation methods. We also propose an efficient configuration space (C-space) construction algorithm using incremental voxel inflation, and extend the state-of-the-art Euclidean signed distance field (ESDF) algorithm, FIESTA, by applying the local map origin update feature. The proposed planning system requires only a single depth camera as a sensor and can operate in real time on embedded computing platforms. We have verified the planning system through real-world flight tests in dense environments using a light-weight quadrotor platform under 300 mm in size, equipped with only low-cost components.
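The local map origin update described above can be illustrated with a rolling (ring-buffer) voxel grid: voxels are stored at indices taken modulo the map size, so shifting the origin only clears the slices that rotate out of range. The class name, map size, and occupancy encoding below are our own illustrative choices, not the paper's implementation.

```python
import numpy as np

class RollingVoxelMap:
    """Rolling voxel grid sketch: updating the local map origin clears
    only the vacated slices, giving constant-time cost with respect to
    total flight distance."""

    def __init__(self, size=64):
        self.size = size
        self.grid = np.zeros((size, size, size), dtype=np.int8)
        self.origin = np.zeros(3, dtype=int)  # world index of the map's min corner

    def _slot(self, world_idx):
        # map a world voxel index onto the fixed storage array
        return tuple(np.mod(world_idx, self.size))

    def contains(self, world_idx):
        rel = np.asarray(world_idx) - self.origin
        return bool(np.all(rel >= 0) and np.all(rel < self.size))

    def set(self, world_idx, value):
        if self.contains(world_idx):
            self.grid[self._slot(world_idx)] = value

    def get(self, world_idx):
        return int(self.grid[self._slot(world_idx)]) if self.contains(world_idx) else 0

    def shift_origin(self, delta):
        """Move the origin by `delta` cells per axis, clearing only the
        slices that leave the map; all other voxels stay in place."""
        for axis, d in enumerate(np.asarray(delta, dtype=int)):
            step = 1 if d > 0 else -1
            for _ in range(abs(int(d))):
                if step > 0:
                    clear = self.origin[axis] % self.size  # row rotating out
                    self.origin[axis] += 1
                else:
                    self.origin[axis] -= 1
                    clear = self.origin[axis] % self.size  # row rotating in
                index = [slice(None)] * 3
                index[axis] = clear
                self.grid[tuple(index)] = 0
```

Because the storage array is reused as the UAV flies, the per-update cost depends only on the local map size, which is what makes the map maintenance invariant to operating radius and flight distance.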
|
| |
| 10:00-11:30, Paper MoAIP-11.11 | Add to My Program |
| Bubble Explorer: Fast UAV Exploration in Large-Scale and Cluttered 3D-Environments Using Occlusion-Free Spheres |
|
| Tang, Benxu | The University of Hong Kong |
| Ren, Yunfan | The University of Hong Kong |
| Zhu, Fangcheng | The University of Hong Kong |
| He, Rui | The University of Hong Kong |
| Liang, Siqi | Harbin Institute of Technology, Shenzhen |
| Kong, Fanze | The University of Hong Kong |
| Zhang, Fu | University of Hong Kong |
Keywords: Aerial Systems: Applications, Aerial Systems: Perception and Autonomy, Motion and Path Planning
Abstract: Autonomous exploration is a crucial aspect of robotics that has numerous applications. Most existing methods greedily choose goals that maximize immediate reward. This strategy is computationally efficient but insufficient for overall exploration efficiency. In recent years, state-of-the-art methods have been proposed that generate a global coverage path and significantly improve overall exploration efficiency. However, global optimization produces high computational overhead, leading to low-frequency planner updates and inconsistent planning motion. In this work, we propose a novel method to support fast UAV exploration in large-scale and cluttered 3-D environments. We introduce a computationally low-cost viewpoint generation method using occlusion-free spheres. Additionally, we combine the greedy strategy with global optimization, which considers both computational and exploration efficiency. We benchmark our method against state-of-the-art methods to showcase its superiority in terms of exploration efficiency and computational time. We conduct various real-world experiments to demonstrate the excellent performance of our method in large-scale and cluttered environments.
|
| |
| 10:00-11:30, Paper MoAIP-11.12 | Add to My Program |
| UPPLIED: UAV Path Planning for Inspection through Demonstration |
|
| Kannan, Shyam Sundar | Purdue University |
| Venkatesh, L.N Vishnunandan | Purdue University |
| Senthilkumaran, Revanth Krishna | Purdue University |
| Min, Byung-Cheol | Purdue University |
Keywords: Aerial Systems: Applications
Abstract: In this paper, a new demonstration-based path-planning framework for the visual inspection of large structures using UAVs is proposed. We introduce UPPLIED: UAV Path PLanning for InspEction through Demonstration, which utilizes a demonstrated trajectory to generate a new trajectory to inspect other structures of the same kind. The demonstrated trajectory can inspect specific regions of the structure, and the new trajectory generated by UPPLIED inspects similar regions in the other structure. The proposed method generates inspection points from the demonstrated trajectory and uses standardization to translate those inspection points to inspect the new structure. Finally, the position of these inspection points is optimized to refine their view. Numerous experiments were conducted with various structures, and the proposed framework was able to generate inspection trajectories of various kinds for different structures based on the demonstration. The generated trajectories match the demonstrated trajectory in geometry while inspecting the regions covered by the demonstration trajectory with minimal deviation. The experimental video of the work can be found at https://youtu.be/YqPx-cLkv04.
|
| |
| 10:00-11:30, Paper MoAIP-11.13 | Add to My Program |
| Learning Fluid Flow Visualizations from In-Flight Images with Tufts |
|
| Lee, Jongseok | German Aerospace Center |
| Olsman, Jurrien | German Aerospace Center (DLR) |
| Triebel, Rudolph | German Aerospace Center (DLR) |
Keywords: Aerial Systems: Applications, Computer Vision for Automation, Object Detection, Segmentation and Categorization
Abstract: For a better understanding of fluid flows around aerial systems, strips of wire or rope, widely known as tufts, are often used to visualize the local flow direction. This paper presents a computer vision system that automatically extracts the shape of tufts from images, which have been collected during real flights of a helicopter and an unmanned aerial vehicle (UAV). As images from these aerial systems present challenges to both model-based computer vision and end-to-end supervised deep learning techniques, we propose a semantic segmentation pipeline that consists of three uncertainty-based modules, namely, (a) active learning for object detection, (b) label propagation for object classification, and (c) weakly supervised instance segmentation. Overall, these probabilistic approaches facilitate the learning process without requiring any manual annotations of semantic segmentation masks. Empirically, we motivate our design choices through comparative assessments and provide real-world demonstrations of the proposed concept, for the first time to our knowledge. The project website can be found at https://sites.google.com/view/tuftrecognition.
|
| |
| 10:00-11:30, Paper MoAIP-11.14 | Add to My Program |
| Fully Autonomous Brick Pick-And-Place in Fields by Articulated Aerial Robot (I) |
|
| Anzai, Tomoki | The University of Tokyo |
| Zhao, Moju | The University of Tokyo |
| Nishio, Takuzumi | The University of Tokyo |
| Shi, Fan | ETH Zürich |
| Okada, Kei | The University of Tokyo |
| Inaba, Masayuki | The University of Tokyo |
Keywords: Aerial Systems: Applications, Field Robots, Grasping
Abstract: Picking and placing objects with an aerial robot in the field is an important and challenging task that can significantly benefit not only industry but also rescue operations. The general strategy relies on magnetic force to pick up objects, which, however, lacks both generality and robustness. We therefore focus on an articulated structure to grasp bricks. Another issue in performing pick-and-place tasks in the field is autonomous recognition using onboard sensors. In this article, we present fully autonomous brick pick-and-place by an articulated aerial robot. First, an articulated robot model with an actively tiltable sensor is developed to guarantee robustness in both state estimation and object detection. Second, object detection methods are designed according to the distance between the robot and the target object. Third, a comprehensive motion strategy is developed to perform an autonomous object searching, picking, and placing sequence. In particular, a visual servoing method for robot position control is proposed within this motion strategy to improve robustness while approaching the target. Finally, we present the experimental results of autonomous brick pick-and-place in the field.
|
| |
| MoAIP-12 Regular session, Hall E |
Add to My Program |
| Perception for Grasping and Manipulation I |
|
| |
| |
| 10:00-11:30, Paper MoAIP-12.1 | Add to My Program |
| I2c-Net: Using Instance-Level Neural Networks for Monocular Category-Level 6D Pose Estimation |
|
| Remus, Alberto | Sant'Anna School of Advanced Studies |
| D'Avella, Salvatore | Scuola Superiore Sant'Anna |
| Di Felice, Francesco | Mechanical Intelligence Institute, Sant'Anna School of Advanced |
| Tripicchio, Paolo | Scuola Superiore Sant'Anna |
| Avizzano, Carlo Alberto | Scuola Superiore Sant'Anna |
Keywords: Perception for Grasping and Manipulation, Deep Learning for Visual Perception, RGB-D Perception
Abstract: Object detection and pose estimation are strict requirements for many robotic grasping and manipulation applications to endow robots with the ability to grasp objects with different properties in cluttered scenes and under various lighting conditions. This work proposes the framework i2c-net to extract the 6D pose of multiple objects belonging to different categories, starting from an instance-level pose estimation network and relying only on RGB images. The network is trained on a custom-made synthetic photo-realistic dataset, generated from some base CAD models, suitably deformed and enriched with real textures for domain randomization purposes. At inference time, the instance-level network is employed in combination with a 3D mesh reconstruction module, achieving category-level capabilities. Depth information is used in postprocessing as a correction. Tests conducted on real objects of the YCB-V and NOCS REAL datasets demonstrate the high accuracy of the proposed approach.
|
| |
| 10:00-11:30, Paper MoAIP-12.2 | Add to My Program |
| Self-Supervised Instance Segmentation by Grasping |
|
| Liu, YuXuan | Covariant.ai, UC Berkeley |
| Chen, Xi | Embodied Intelligence, UC Berkeley |
| Abbeel, Pieter | UC Berkeley |
Keywords: Object Detection, Segmentation and Categorization, Deep Learning for Visual Perception, Perception for Grasping and Manipulation
Abstract: Instance segmentation is a fundamental skill for many robotic applications. We propose a self-supervised method that uses grasp interactions to collect segmentation supervision for an instance segmentation model. When a robot grasps an item, the mask of that grasped item can be inferred from the images of the scene before and after the grasp. Leveraging this insight, we learn a grasp segmentation model from a small dataset of labelled images to segment the grasped object from before and after grasp images. Such a model can segment grasped objects from thousands of grasp interactions without costly human annotation. Using the segmented grasped objects, we can "cut" objects from their original scenes and "paste" them into new scenes to generate instance supervision. We show that our grasp segmentation model provides a 5x error reduction when segmenting grasped objects compared with traditional image subtraction approaches. Combined with our "cut-and-paste" generation method, instance segmentation models trained with our method achieve better performance than a model trained with 10x the amount of labelled data. On a real robotic grasping system, our instance segmentation model reduces the rate of grasp errors by over 3x compared to an image subtraction baseline.
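The image-subtraction baseline that the abstract compares against can be sketched as below: the grasped item's mask is inferred by differencing the scene images taken before and after the grasp. The function name and thresholds are illustrative; the paper's learned grasp segmentation model replaces this step.

```python
import numpy as np

def grasped_object_mask(before, after, threshold=25, min_pixels=10):
    """Infer the mask of a grasped item from before/after grasp images
    by simple subtraction (the baseline, not the paper's learned model)."""
    diff = np.abs(before.astype(np.int16) - after.astype(np.int16))
    if diff.ndim == 3:          # collapse an RGB difference to one channel
        diff = diff.max(axis=2)
    mask = diff > threshold
    # reject tiny responses that are likely sensor noise
    return mask if mask.sum() >= min_pixels else np.zeros_like(mask)

# toy scene: a 5x5 bright object disappears after being grasped
before = np.full((20, 20), 80, dtype=np.uint8)
before[5:10, 5:10] = 200
after = np.full((20, 20), 80, dtype=np.uint8)
mask = grasped_object_mask(before, after)  # True exactly on the object
```

Subtraction like this is brittle under lighting changes and scene disturbance, which is the failure mode motivating the learned model's reported 5x error reduction.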
|
| |
| 10:00-11:30, Paper MoAIP-12.3 | Add to My Program |
| Fusing Visual Appearance and Geometry for Multi-Modality 6DoF Object Tracking |
|
| Stoiber, Manuel | German Aerospace Center (DLR) |
| Elsayed, Mariam | Technical University Munich |
| Reichert, Anne Elisabeth | German Aerospace Center |
| Steidle, Florian | German Aerospace Center |
| Lee, Dongheui | Technische Universität Wien (TU Wien) |
| Triebel, Rudolph | German Aerospace Center (DLR) |
Keywords: Visual Tracking, Perception for Grasping and Manipulation, RGB-D Perception
Abstract: In many applications of advanced robotic manipulation, six degrees of freedom (6DoF) object pose estimates are continuously required. In this work, we develop a multi-modality tracker that fuses information from visual appearance and geometry to estimate object poses. The algorithm extends our previous method ICG, which uses geometry, to additionally consider surface appearance. In general, object surfaces contain local characteristics from text, graphics, and patterns, as well as global differences from distinct materials and colors. To incorporate this visual information, two modalities are developed. For local characteristics, keypoint features are used to minimize distances between points from keyframes and the current image. For global differences, a novel region approach is developed that considers multiple regions on the object surface. In addition, it allows the modeling of external geometries. Experiments on the YCB-Video and OPT datasets demonstrate that our approach ICG+ performs best on both datasets, outperforming both conventional and deep learning-based methods. At the same time, the algorithm is highly efficient and runs at more than 300 Hz. The source code of our tracker is publicly available.
|
| |
| 10:00-11:30, Paper MoAIP-12.4 | Add to My Program |
| Viewpoint Push Planning for Mapping of Unknown Confined Spaces |
|
| Dengler, Nils | University of Bonn |
| Pan, Sicong | University of Bonn |
| Kalagaturu, Vamsi Krishna | Hochschule Bonn-Rhein-Sieg |
| Menon, Rohit | University of Bonn |
| Elnagdi, Murad | University of Bonn |
| Bennewitz, Maren | University of Bonn |
Keywords: Perception for Grasping and Manipulation
Abstract: Viewpoint planning is an important task in any application where objects or scenes need to be viewed from different angles to achieve sufficient coverage. The mapping of confined spaces such as shelves is an especially challenging task since objects occlude each other and the scene can only be observed from the front, posing limitations on the possible viewpoints. In this paper, we propose a deep reinforcement learning framework that generates promising views aiming at reducing the map entropy. Additionally, the pipeline extends standard viewpoint planning by predicting adequate minimally invasive push actions to uncover occluded objects and increase the visible space. Using a 2.5D occupancy height map as state representation that can be efficiently updated, our system decides whether to plan a new viewpoint or perform a push. To learn feasible pushes, we use a neural network to sample push candidates on the map based on training data provided by human experts. As simulated and real-world experimental results with a robotic arm show, our system is able to significantly increase the mapped space compared to different baselines, while the executed push actions highly benefit the viewpoint planner with only minor changes to the object configuration.
|
| |
| 10:00-11:30, Paper MoAIP-12.5 | Add to My Program |
| Depth-Based 6DoF Object Pose Estimation Using Swin Transformer |
|
| Li, Zhujun | The City University of New York |
| Stamos, Ioannis | City University of New York |
Keywords: Perception for Grasping and Manipulation, Deep Learning Methods, Object Detection, Segmentation and Categorization
Abstract: Accurately estimating the 6D pose of objects is crucial for many applications, such as robotic grasping, autonomous driving, and augmented reality. However, this task becomes more challenging in poor lighting conditions or when dealing with textureless objects. To address this issue, depth images are becoming an increasingly popular choice due to their invariance to a scene's appearance and the implicit incorporation of essential geometric characteristics. However, fully leveraging depth information to improve the performance of pose estimation remains a difficult and under-investigated problem. To tackle this challenge, we propose a novel framework called SwinDePose, which uses only geometric information from depth images to achieve accurate 6D pose estimation. SwinDePose first calculates the angles between each normal vector defined in a depth image and the three coordinate axes in the camera coordinate system. The resulting angles are then formed into an image, which is encoded using a Swin Transformer. Additionally, we apply RandLA-Net to learn representations from point clouds. The resulting image and point cloud embeddings are concatenated and fed into a semantic segmentation module and a 3D keypoint localization module. Finally, we estimate 6D poses using a least-squares fitting approach based on the target object's predicted semantic mask and 3D keypoints. In experiments on the LineMod and Occlusion LineMod datasets, SwinDePose outperforms existing state-of-the-art methods for 6D object pose estimation using depth images. We also provide competitive results on the YCB-Video dataset even without post-processing. This demonstrates the effectiveness of our approach and highlights its potential for improving performance in real-world scenarios. Our code is at https://github.com/zhujunli1993/SwinDePose.
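The normal-to-axis angle encoding described in the abstract can be sketched in a few lines: for each pixel, the angle between the unit surface normal and an axis is the arccosine of the corresponding normal component. The function name and conventions below are our own assumptions, not the authors' code.

```python
import numpy as np

def normal_angle_image(normals):
    """Convert an HxWx3 surface-normal map into a 3-channel angle image:
    per pixel, the angle (radians) between the unit normal and each of
    the camera coordinate axes."""
    n = normals / np.linalg.norm(normals, axis=-1, keepdims=True)
    # the dot product of a unit normal with the x/y/z axis is simply its
    # x/y/z component, so arccos of each component gives the three angles
    return np.arccos(np.clip(n, -1.0, 1.0))

normals = np.zeros((2, 2, 3))
normals[..., 2] = 1.0              # every normal points along +z
angles = normal_angle_image(normals)
```

For this toy map the z-channel of `angles` is 0 and the x/y channels are pi/2; an image formed from such angles is what would then be encoded by the Swin Transformer.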
|
| |
| 10:00-11:30, Paper MoAIP-12.6 | Add to My Program |
| DR-Pose: A Two-Stage Deformation-And-Registration Pipeline for Category-Level 6D Object Pose Estimation |
|
| Zhou, Lei | National University of Singapore |
| Liu, Zhiyang | National University of Singapore |
| Gan, Runze | National University of Singapore |
| Wang, Haozhe | National University of Singapore |
| Ang Jr, Marcelo H | National University of Singapore |
Keywords: Perception for Grasping and Manipulation, Deep Learning for Visual Perception
Abstract: Category-level object pose estimation involves estimating the 6D pose and the 3D metric size of objects from predetermined categories. While recent approaches take categorical shape prior information as a reference to improve pose estimation accuracy, the single-stage network design and training manner lead to sub-optimal performance, since there are two distinct tasks in the pipeline. In this paper, the advantage of a two-stage pipeline over a single-stage design is discussed. To this end, we propose a two-stage deformation-and-registration pipeline called DR-Pose, which consists of a completion-aided deformation stage and a scaled registration stage. The first stage uses a point cloud completion method to generate the unseen parts of the target object, guiding subsequent deformation of the shape prior. In the second stage, a novel registration network is designed to extract pose-sensitive features and predict the representation of the object's partial point cloud in canonical space based on the deformation results from the first stage. DR-Pose produces superior results to the state-of-the-art shape prior-based methods on both the CAMERA25 and REAL275 benchmarks. Code is available at https://github.com/Zray26/DR-Pose.git.
|
| |
| 10:00-11:30, Paper MoAIP-12.7 | Add to My Program |
| Learning from Pixels with Expert Observations |
|
| Hoang, Minh-Huy | University of Science, Ho Chi Minh City, Vietnam |
| Dinh, Long | Hanoi University of Science & Technology |
| Hai, Nguyen | Northeastern University |
Keywords: Reinforcement Learning, Learning from Demonstration, Deep Learning in Grasping and Manipulation
Abstract: In reinforcement learning (RL), sparse rewards can present a significant challenge. Fortunately, expert actions can be utilized to overcome this issue. However, acquiring explicit expert actions can be costly, and expert observations are often more readily available. This paper presents a new approach that uses expert observations for learning in robot manipulation tasks with sparse rewards from pixel observations. Specifically, our technique involves using expert observations as intermediate visual goals for a goal-conditioned RL agent, enabling it to complete a task by successively reaching a series of goals. We demonstrate the efficacy of our method in five challenging block construction tasks in simulation and show that when combined with two state-of-the-art agents, our approach can significantly improve their performance while requiring 4-20 times fewer expert actions during training. Moreover, our method is also superior to a hierarchical baseline.
|
| |
| 10:00-11:30, Paper MoAIP-12.8 | Add to My Program |
| RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control |
|
| Xiang, Yanfei | Tsinghua University |
| Wang, Xin | University at Buffalo |
| Hu, Shu | Carnegie Mellon University |
| Zhu, Bin Benjamin | Microsoft Research Asia |
| Huang, Xiaomeng | Tsinghua University |
| Wu, Xi | Chengdu University of Information Technology |
| Lyu, Siwei | University at Buffalo |
Keywords: Reinforcement Learning, Performance Evaluation and Benchmarking
Abstract: Reinforcement learning is used to tackle complex tasks with high-dimensional sensory inputs. Over the past decade, a wide range of reinforcement learning algorithms have been developed, with recent progress benefiting from deep learning for raw sensory signal representation. This raises a natural question: how well do these algorithms perform across different robotic manipulation tasks? Benchmarks use objective performance metrics to offer a scientific way to compare algorithms. In this paper, we introduce RMBench, the first benchmark for robotic manipulations with high-dimensional continuous action and state spaces. We implement and evaluate reinforcement learning algorithms that take observed pixels as inputs and report their average performance and learning curves to demonstrate their performance and training stability. Our study concludes that none of the evaluated algorithms handles all tasks well: Soft Actor-Critic outperforms most algorithms in terms of average reward and stability, and an algorithm combined with data augmentation can potentially facilitate learning policies. Our code is publicly available at https://github.com/xiangyanfei212/RMBench-2022.git, including all benchmark tasks and studied algorithms.
|
| |
| 10:00-11:30, Paper MoAIP-12.9 | Add to My Program |
| Shape Completion with Prediction of Uncertain Regions |
|
| Humt, Matthias | German Aerospace Center (DLR), Technical University Munich (TUM) |
| Winkelbauer, Dominik | DLR |
| Hillenbrand, Ulrich | German Aerospace Center (DLR) |
Keywords: Perception for Grasping and Manipulation, RGB-D Perception
Abstract: Shape completion, i.e., predicting the complete geometry of an object from a partial observation, is highly relevant for several downstream tasks, most notably robotic manipulation. When basing planning or prediction of real grasps on object shape reconstruction, an indication of severe geometric uncertainty is indispensable. In particular, there can be an irreducible uncertainty in extended regions about the presence of entire object parts when given ambiguous object views. To treat this important case, we propose two novel methods for predicting such uncertain regions as straightforward extensions of any method for predicting local spatial occupancy, one through postprocessing occupancy scores, the other through direct prediction of an uncertainty indicator. We compare these methods together with two known approaches to probabilistic shape completion. Moreover, we generate a dataset, derived from ShapeNet [1], of realistically rendered depth images of object views with ground-truth annotations for the uncertain regions. We train on this dataset and test each method in shape completion and prediction of uncertain regions for known and novel object instances and on synthetic and real data. While direct uncertainty prediction is by far the most accurate in the segmentation of uncertain regions, both novel methods outperform the two baselines in shape completion and uncertain region prediction, and avoiding the predicted uncertain regions increases the quality of grasps for all tested methods. Web: https://github.com/DLR-RM/shape-completion
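The score-postprocessing variant described above can be illustrated with a toy sketch: occupancy scores near the decision boundary are flagged as an uncertain region. The thresholds and the `label_voxels` name are assumptions for illustration, not the paper's method.

```python
# Toy sketch (assumed thresholds): postprocess per-voxel occupancy
# probabilities into free / occupied / uncertain labels.

def label_voxels(occupancy_probs, lo=0.3, hi=0.7):
    """Map occupancy probabilities to free/uncertain/occupied labels."""
    labels = []
    for p in occupancy_probs:
        if p < lo:
            labels.append("free")
        elif p > hi:
            labels.append("occupied")
        else:                       # ambiguous score -> uncertain region
            labels.append("uncertain")
    return labels
```

A grasp planner could then simply avoid contact points falling inside voxels labelled "uncertain", as the abstract reports improves grasp quality.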
|
| |
| 10:00-11:30, Paper MoAIP-12.10 | Add to My Program |
| Structure from Action: Learning Interactions for 3D Articulated Object Structure Discovery |
|
| Nie, Neil | Columbia University |
| Gadre, Samir Yitzhak | Columbia University |
| Ehsani, Kiana | Allen Institute for Artificial Intelligence |
| Song, Shuran | Columbia University |
Keywords: Object Detection, Segmentation and Categorization, Perception-Action Coupling, Deep Learning for Visual Perception
Abstract: We introduce Structure from Action (SfA), a framework to discover 3D part geometry and joint parameters of unseen articulated objects via a sequence of inferred interactions. Our key insight is that 3D interaction and perception should be considered in conjunction to construct 3D articulated CAD models, especially for categories not seen during training. By selecting informative interactions, SfA discovers parts and reveals occluded surfaces, like the inside of a closed drawer. By aggregating visual observations in 3D, SfA accurately segments multiple parts, reconstructs part geometry, and infers all joint parameters in a canonical coordinate frame. Our experiments demonstrate that a SfA model trained in simulation can generalize to many unseen object categories with diverse structures and to real-world objects. Empirically, SfA outperforms a pipeline of state-of-the-art components by 25.4 3D IoU percentage points on unseen categories, while matching already performant joint estimation baselines.
|
| |
| 10:00-11:30, Paper MoAIP-12.11 | Add to My Program |
| Object-Oriented Option Framework for Robotics Manipulation in Clutter |
|
| Pang, Jing-Cheng | Nanjing University |
| Young, Stalin | Nanjing University |
| Xiong-Hui, Chen | National Key Laboratory for Novel Software Technology, Nanjing U |
| Yang, Xinyu | Nanjing University |
| Yang, Yu | National Key Laboratory for Novel Software Technology, Nanjing U |
| Mas, Ma | CloudMinds Robotics |
| Ziqi, Guo | CloudMinds Robotics |
| Yang, Howard | CloudMinds |
| Huang, Bill | CloudMinds Technologies Inc |
Keywords: Reinforcement Learning, Deep Learning in Grasping and Manipulation
Abstract: Domestic service robots are becoming increasingly popular due to their ability to help people with household tasks. These robots often encounter the challenge of manipulating objects in cluttered environments (MoC), which is difficult due to the complexity of effective planning and control. Previous solutions involved designing specific action primitives and planning paradigms. However, the pre-coded action primitives can limit the agility and task-solving scope of robots. In this paper, we propose a general approach for MoC called the Object-Oriented Option Framework (O3F), which uses the option framework (OF) to learn planning and control. The standard OF discovers options from scratch based on reinforcement learning, which can lead to collapsed options and hurt learning. To address this limitation, O3F introduces the concept of an object-oriented option space for OF, which focuses specifically on object movement and overcomes the challenges associated with collapsed options. Based on this, we train an object-oriented option planner to determine the option to execute and a universal object-oriented option executor to complete the option. Simulation experiments on the Ginger XR1 robot and robot arm show that O3F is generally applicable to various types of robot and manipulation tasks. Furthermore, O3F achieves success rates of 72.4% and 90% in grasping and object collecting tasks, respectively, significantly outperforming baseline methods.
|
| |
| 10:00-11:30, Paper MoAIP-12.12 | Add to My Program |
| Non-Contact Tactile Perception for Hybrid-Active Gripper |
|
| Pereira, Jonathas Henrique Mariano | IFSP - Institute Technology of Sao Paulo, Campus Registro |
| Joventino, Carlos Fernando | IFSP - Institute Technology of Sao Paulo, Campus Registro |
| Fabro, João Alberto | Federal University of Technology - Parana (UTFPR) |
| de Oliveira, Andre Schneider | Federal University of Technology - Parana |
Keywords: Object Detection, Segmentation and Categorization, Perception for Grasping and Manipulation, Manipulation Planning
Abstract: This paper presents a novel approach to object recognition using a reconfigurable gripper with multiple time-of-flight (ToF) sensors attached to the fingers and palm, introducing the concept of non-contact tactile perception. This approach aims to provide a proprioceptive sense in the gripper workspace, allowing object prediction in manipulation tasks. The Hybrid-Active (H-A) gripper can adapt its topology to achieve different object reading points and generate a reliable object estimation. Non-contact tactile perception uses the ToF sensors and the gripper's reconfiguration degrees of freedom for 3D perception and surface estimation of the picked-up object. The method relies on five ToF sensors in the palm that identify the distance and, through the gripper's capability to manage the manipulator, adjust the gripper to the center of the object. The H-A gripper also has twelve sensors distributed over its three fingers: four sensors on each finger, two on the distal phalanx and two on the middle phalanx. The fingers have a rotational mobility of 180°, allowing the sensing of all faces of the object at different angles for three-dimensional reconstruction. The proposed approach was evaluated in four experiments that analyzed the influence of resolution, object complexity, finger tilt, and angular sampling over 13 objects of different complexities. The experimentation set allows an overall evaluation of non-contact tactile perception and the specification of its performance parameters.
|
| |
| MoAIP-13 Regular session, Hall E |
Add to My Program |
| Clone of 'Visual Learning' |
|
| |
| |
| 10:00-11:30, Paper MoAIP-13.1 | Add to My Program |
| ILabel: Revealing Objects in Neural Fields |
|
| Zhi, Shuaifeng | National University of Defense Technology |
| Sucar, Edgar | Imperial College London |
| Mouton, Andre | Dyson Ltd |
| Haughton, Iain | Dyson Ltd |
| Laidlow, Tristan | Boston Dynamics |
| Davison, Andrew J | Imperial College London |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Representation Learning
Abstract: A neural field trained with self-supervision to efficiently represent the geometry and colour of a 3D scene tends to automatically decompose it into coherent and accurate object-like regions, which can be revealed with sparse labelling interactions to produce a 3D semantic scene segmentation. Our real-time iLabel system takes input from a hand-held RGB-D camera, requires zero prior training data, and works in an `open set' manner, with semantic classes defined on the fly by the user. iLabel's underlying model is a simple multilayer perceptron (MLP), trained from scratch to learn a neural representation of a single 3D scene. The model is updated continually and visualised in real-time, allowing the user to focus interactions to achieve extremely efficient semantic segmentation. A room-scale scene can be accurately labelled into 10+ semantic categories with around 100 clicks, taking less than 5 minutes. Quantitative labelling accuracy scales powerfully with the number of clicks, and rapidly surpasses standard pre-trained semantic segmentation methods. We also demonstrate a hierarchical labelling variant of iLabel and a `hands-free' mode where the user only needs to supply label names for automatically-generated locations.
|
| |
| 10:00-11:30, Paper MoAIP-13.2 | Add to My Program |
| Weakly Supervised Referring Expression Grounding via Dynamic Self-Knowledge Distillation |
|
| Mi, Jinpeng | USST |
| Chen, Zhiqian | University of Shanghai for Science and Technology |
| Zhang, Jianwei | University of Hamburg |
Keywords: Visual Learning, Deep Learning for Visual Perception
Abstract: Weakly supervised referring expression grounding (WREG) is an attractive and challenging task for grounding target regions in images by understanding given referring expressions. WREG learns to ground target objects without manual annotations between image regions and referring expressions during the model training phase. Different from the predominant grounding pattern of existing models, which locate target objects by reconstructing the region-expression correspondence, we investigate WREG from a novel perspective and enrich the prevailing pattern with self-knowledge distillation. Specifically, we propose a target-guided self-knowledge distillation approach that adopts the target prediction knowledge learned from previous training iterations as the teacher to guide the subsequent training procedure. To avoid being misled by teacher knowledge with low prediction confidence, we present an uncertainty-aware knowledge refinement strategy that adaptively rectifies the teacher knowledge by learning dynamic threshold values based on the model's prediction uncertainty. To validate the proposed approach, we implement extensive experiments on three benchmark datasets, i.e., RefCOCO, RefCOCO+, and RefCOCOg. Our approach achieves new state-of-the-art results on several splits of the benchmark datasets, showcasing the advantage of the proposed framework for WREG. The implementation codes and trained models are available at: https://github.com/dami23/WREG_Self_KD.
|
| |
| 10:00-11:30, Paper MoAIP-13.3 | Add to My Program |
| EventTransAct: A Video Transformer-Based Framework for Event-Camera Based Action Recognition |
|
| de Blegiers, Tristan | University of Central Florida |
| Dave, Ishan Rajendrakumar | University of Central Florida |
| Yousaf, Adeel | University of Central Florida |
| Shah, Mubarak | University of Central Florida |
Keywords: Gesture, Posture and Facial Expressions, Visual Learning, Computer Vision for Automation
Abstract: Recognizing and comprehending human actions and gestures is a crucial perception requirement for robots to interact with humans and carry out tasks in diverse domains, including service robotics, healthcare, and manufacturing. Event cameras, with their ability to capture fast-moving objects at a high temporal resolution, offer new opportunities compared to standard action recognition in RGB videos. However, previous research on event camera action recognition has primarily focused on sensor-specific network architectures and image encoding, which may not be suitable for new sensors and limit the use of recent advancements in transformer-based architectures. In this study, we employ a computationally efficient model, namely the video transformer network (VTN), which first acquires spatial embeddings per event frame and then utilizes a temporal self-attention mechanism. This approach separates the spatial and temporal operations, making VTN more computationally efficient than other video transformers that process spatio-temporal volumes directly. To better adapt the VTN to the sparse and fine-grained nature of event data, we design an Event-Contrastive Loss (L_EC) and event-specific augmentations. The proposed L_EC promotes learning fine-grained spatial cues in the spatial backbone of VTN by contrasting temporally misaligned frames. We evaluate our method on real-world action recognition on the N-EPIC Kitchens dataset and achieve state-of-the-art results on both protocols: testing in seen kitchens (74.9% accuracy) and testing in unseen kitchens (42.43% and 46.66% accuracy). Our approach also takes less computation time than competitive prior approaches. We also evaluate our method on the standard DVS Gesture recognition dataset, achieving a competitive accuracy of 97.9% compared to prior work that uses dedicated architectures and image encoding for the DVS dataset.
These results demonstrate the potential of our framework, EventTransAct, for real-world applications of event-camera-based action recognition. Project Page: https://tristandb8.github.io/EventTransAct_webpage/
|
| |
| 10:00-11:30, Paper MoAIP-13.4 | Add to My Program |
| Virtual Ski Training System That Allows Beginners to Acquire Ski Skills Based on Physical and Visual Feedbacks |
|
| Okada, Yushi | Waseda University |
| Seo, Chanjin | Waseda University |
| Miyakawa, Shunichi | Waseda University |
| Taniguchi, Motofumi | Waseda University |
| Kanosue, Kazuyuki | Waseda University |
| Ogata, Hiroyuki | Seikei University |
| Ohya, Jun | Waseda University |
Keywords: Virtual Reality and Interfaces, Visual Learning, Sensorimotor Learning
Abstract: This paper proposes a ski training system using VR (Virtual Reality) that enables beginners to acquire skiing skills without going to an actual ski ground. The proposed system obtains the speed of skiing based on the center of pressure (COP) of each player's foot. The first-person perspective of skiing at the obtained speed down a ski slope is fed back to the player as a VR image. Experiments were conducted to evaluate the effectiveness of the proposed system and the VR interface. Specifically, beginner skiers were categorized into three groups: "a group trained with the proposed VR system", "a group trained with a system that provides feedback of the skiing speed calculated from the COP by increasing or decreasing the gauge (a bar-shaped graph representing changes in numerical values), instead of VR", and "a group that does not train with the system". After training under each of these conditions, a sliding test was conducted on an actual ski slope to check the degree of skill acquisition. The results show that subjects trained with the proposed system acquired more skiing skills than subjects who did not use the system on actual ski slopes. Furthermore, there was no clear difference in the result of the sliding test between subjects trained by the VR interface and those trained by the gauge interface, but the VR interface yields better deceleration postures.
|
| |
| 10:00-11:30, Paper MoAIP-13.5 | Add to My Program |
| Attention-Based VR Facial Animation with Visual Mouth Camera Guidance for Immersive Telepresence Avatars |
|
| Rochow, Andre | University of Bonn |
| Schwarz, Max | University Bonn |
| Behnke, Sven | University of Bonn |
Keywords: Gesture, Posture and Facial Expressions, Visual Learning, Human-Robot Collaboration
Abstract: Facial animation in virtual reality environments is essential for applications that necessitate clear visibility of the user's face and the ability to convey emotional signals. In our scenario, we animate the face of an operator who controls a robotic Avatar system. The use of facial animation is particularly valuable when the perception of interacting with a specific individual, rather than just a robot, is intended. Purely keypoint-driven animation approaches struggle with the complexity of facial movements. We present a hybrid method that uses both keypoints and direct visual guidance from a mouth camera. Our method generalizes to unseen operators and requires only a quick enrolment step with capture of two short videos. Multiple source images are selected with the intention to cover different facial expressions. Given a mouth camera frame from the HMD, we dynamically construct the target keypoints and apply an attention mechanism to determine the importance of each source image. To resolve keypoint ambiguities and animate a broader range of mouth expressions, we propose to inject visual mouth camera information into the latent space. We enable training on large-scale speaking head datasets by simulating the mouth camera input with its perspective differences and facial deformations. Our method outperforms a baseline in quality, capability, and temporal consistency. In addition, we highlight how the facial animation contributed to our victory at the ANA Avatar XPRIZE Finals.
|
| |
| 10:00-11:30, Paper MoAIP-13.6 | Add to My Program |
| Test-Time Adaptation for Point Cloud Upsampling Using Meta-Learning |
|
| Hatem, Ahmed | University of Manitoba |
| Qian, Yiming | University of Manitoba |
| Wang, Yang | Concordia University |
Keywords: Visual Learning, Deep Learning Methods, Transfer Learning
Abstract: Affordable 3D scanners often produce sparse and non-uniform point clouds that negatively impact downstream applications in robotic systems. While existing point cloud upsampling architectures have demonstrated promising results on standard benchmarks, they tend to experience significant performance drops when the test data have different distributions from the training data. To address this issue, this paper proposes a test-time adaption approach to enhance model generality of point cloud upsampling. The proposed approach leverages meta-learning to explicitly learn network parameters for test-time adaption. Our method does not require any prior information about the test data. During meta-training, the model parameters are learned from a collection of instance-level tasks, each of which consists of a sparse-dense pair of point clouds from the training data. During meta-testing, the trained model is fine-tuned with a few gradient updates to produce a unique set of network parameters for each test instance. The updated model is then used for the final prediction. Our framework is generic and can be applied in a plug-and-play manner with existing backbone networks in point cloud upsampling. Extensive experiments demonstrate that our approach improves the performance of state-of-the-art models.
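The per-instance fine-tuning loop described above can be sketched in miniature. This is a toy stand-in under stated assumptions: a scalar model y = w*x replaces the meta-learned upsampling network, and the `adapt`/`predict` names are illustrative.

```python
# Toy sketch (assumed model and loss): at test time, the meta-trained
# parameters are fine-tuned with a few gradient updates on each instance,
# yielding a unique parameter set per test input.

def adapt(w_init, xs, ys, steps=20, lr=0.05):
    """Fine-tune a scalar weight w (model y = w*x) with a few SGD steps."""
    w = w_init
    for _ in range(steps):
        # gradient of mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w

def predict(w, x):
    """Run the adapted model on a new input."""
    return w * x
```

In the actual method, the meta-training phase (MAML-style) would choose `w_init` so that these few updates suffice for any test instance.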
|
| |
| 10:00-11:30, Paper MoAIP-13.7 | Add to My Program |
| Revisiting Event-Based Video Frame Interpolation |
|
| Chen, Jiaben | University of California, San Diego |
| Zhu, Yichen | Shanghaitech University |
| Lian, Dongze | National University of Singapore |
| Yang, Jiaqi | ShanghaiTech University |
| Wang, Yifu | ShanghaiTech University |
| Zhang, Renrui | Peking University |
| Liu, Xinhang | HKUST |
| Qian, Shenhan | Technical University of Munich |
| Kneip, Laurent | ShanghaiTech University |
| Gao, Shenghua | Shanghaitech University |
Keywords: Visual Learning, Sensor Fusion, Deep Learning for Visual Perception
Abstract: Dynamic vision sensors or event cameras provide rich complementary information for video frame interpolation. Existing state-of-the-art methods follow the paradigm of combining both synthesis-based and warping networks. However, few of those methods fully respect the intrinsic characteristics of event streams. Given that event cameras only encode intensity changes and polarity rather than color intensities, estimating optical flow from events is arguably more difficult than from RGB information. We therefore propose to incorporate RGB information in an event-guided optical flow refinement strategy. Moreover, in light of the quasi-continuous nature of the time signals provided by event cameras, we propose a divide-and-conquer strategy in which event-based intermediate frame synthesis happens incrementally in multiple simplified stages rather than in a single, long stage. Extensive experiments on both synthetic and real-world datasets show that these modifications lead to more reliable and realistic intermediate frame results than previous video frame interpolation methods. Our findings underline that a careful consideration of event characteristics such as high temporal density and elevated noise benefits interpolation accuracy.
|
| |
| 10:00-11:30, Paper MoAIP-13.8 | Add to My Program |
| Revisiting Deformable Convolution for Depth Completion |
|
| Sun, Xinglong | Stanford & UIUC |
| Ponce, Jean | École Normale Supérieure |
| Wang, Yu-Xiong | University of Illinois Urbana-Champaign |
Keywords: RGB-D Perception, Visual Learning
Abstract: Depth completion, which aims to generate high-quality dense depth maps from sparse depth maps, has attracted increasing attention in recent years. Previous work usually employs RGB images as guidance, and introduces iterative spatial propagation to refine estimated coarse depth maps. However, most of the propagation refinement methods require several iterations and suffer from a fixed receptive field, which may contain irrelevant and useless information with very sparse input. In this paper, we address these two challenges simultaneously by revisiting the idea of deformable convolution. We propose an effective architecture that leverages deformable kernel convolution as a single-pass refinement module, and empirically demonstrate its superiority. To better understand the function of deformable convolution and exploit it for depth completion, we further systematically investigate a variety of representative strategies. Our study reveals that, different from prior work, deformable convolution needs to be applied on an estimated depth map with a relatively high density for better performance. We evaluate our model on the large-scale KITTI dataset and achieve state-of-the-art level performance in both accuracy and inference speed. Our code is available at https://github.com/AlexSunNik/ReDC.
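The core mechanism of deformable convolution, kernel taps that sample at learned fractional offsets rather than on a fixed grid, can be shown in a 1-D toy sketch. The function names, the 3-tap kernel, and the zero-padding choice are assumptions for illustration; the paper applies 2-D deformable kernels to estimated depth maps.

```python
# Toy sketch (assumed 1-D setting): each kernel tap reads the signal at
# its base position plus a learned offset, via linear interpolation, so
# the receptive field adapts instead of staying fixed.

def linear_sample(signal, pos):
    """Sample a 1-D signal at a fractional position (zero padding)."""
    i0 = int(pos // 1)
    frac = pos - i0
    def at(i):
        return signal[i] if 0 <= i < len(signal) else 0.0
    return (1 - frac) * at(i0) + frac * at(i0 + 1)

def deformable_conv1d(signal, weights, offsets, center):
    """Weighted sum over taps k = -1, 0, 1, each displaced by an offset."""
    taps = [-1, 0, 1]
    return sum(w * linear_sample(signal, center + k + o)
               for w, k, o in zip(weights, taps, offsets))
```

With all offsets at zero this reduces to an ordinary convolution; nonzero offsets let the kernel reach past irrelevant or empty regions of a sparse input.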
|
| |
| 10:00-11:30, Paper MoAIP-13.9 | Add to My Program |
| Long-Distance Gesture Recognition Using Dynamic Neural Networks |
|
| Bhatnagar, Shubhang | University of Illinois at Urbana-Champaign |
| Gopal, Sharath | Bosch |
| Ahuja, Narendra | Univ. of Illinois |
| Ren, Liu | Robert Bosch North America Research Technology Center |
Keywords: Gesture, Posture and Facial Expressions, Visual Learning, Recognition
Abstract: Gestures form an important medium of communication between humans and machines. An overwhelming majority of existing gesture recognition methods are tailored to a scenario where humans and machines are located very close to each other. This short-distance assumption does not hold true for several types of interactions, for example gesture-based interactions with a floor cleaning robot or with a drone. Methods made for short-distance recognition are unable to perform well on long-distance recognition due to gestures occupying only a small portion of the input data. Their performance is especially worse in resource constrained settings where they are not able to effectively focus their limited compute on the gesturing subject. We propose a novel, accurate and efficient method for the recognition of gestures from longer distances. It uses a dynamic neural network to select features from gesture-containing spatial regions of the input sensor data for further processing. This helps the network focus on features important for gesture recognition while discarding background features early on, thus making it more compute efficient compared to other techniques. We demonstrate the performance of our method on the LD-ConGR long-distance dataset where it outperforms previous state-of-the-art methods on recognition accuracy and compute efficiency.
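The dynamic gating idea, cheaply scoring spatial regions and spending the expensive computation only on the most gesture-like ones, can be sketched as below. The scoring and "expensive head" functions are toy placeholders, not the paper's networks.

```python
# Toy sketch (assumed placeholder functions): score all regions cheaply,
# then run the heavy recognition head only on the top-k regions, so
# background features are discarded early.

def cheap_score(patch):
    return sum(patch) / len(patch)          # stand-in saliency score

def expensive_head(patch):
    return max(patch)                       # stand-in recognition feature

def recognize(patches, k=2):
    """Run the expensive head only on the top-k scored regions."""
    ranked = sorted(range(len(patches)),
                    key=lambda i: cheap_score(patches[i]), reverse=True)
    return {i: expensive_head(patches[i]) for i in ranked[:k]}
```

Because the heavy head runs on k regions instead of all of them, compute scales with the number of selected regions rather than the full frame.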
|
| |
| 10:00-11:30, Paper MoAIP-13.10 | Add to My Program |
| Neural Implicit Vision-Language Feature Fields |
|
| Blomqvist, Kenneth | ETH Zurich |
| Milano, Francesco | ETH Zurich |
| Chung, Jen Jen | The University of Queensland |
| Ott, Lionel | ETH Zurich |
| Siegwart, Roland | ETH Zurich |
Keywords: Semantic Scene Understanding, Visual Learning, Representation Learning
Abstract: Recently, groundbreaking results have been presented on open-vocabulary semantic image segmentation. Such methods segment each pixel in an image into arbitrary categories provided at run-time in the form of text prompts, as opposed to a fixed set of classes defined at training time. In this work, we present a method for volumetric open-vocabulary semantic scene segmentation. Our method builds on the insight that we can fuse 2D image features from a vision-language model into a neural implicit representation. We show that the resulting feature field can be segmented into different classes by assigning points to the closest natural language text prompt. Using an implicit volumetric representation enables us to segment the scene both in 3D and 2D by rendering feature maps from any given viewpoint of the scene. We show that our method works on noisy real-world data and can run in real-time on live sensor data dynamically adjusting to text prompts. We also present quantitative comparisons on the diverse ScanNet dataset.
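The open-vocabulary assignment step, labelling each rendered point feature with the text prompt whose embedding is most similar, can be sketched with cosine similarity. The 2-D toy embeddings here stand in for the vision-language features; the `segment` name is an assumption.

```python
# Toy sketch (assumed embeddings): each point feature is assigned to the
# closest natural-language prompt by cosine similarity.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def segment(point_features, prompt_embeddings):
    """Return, per point, the name of the most similar text prompt."""
    return [max(prompt_embeddings,
                key=lambda name: cosine(feat, prompt_embeddings[name]))
            for feat in point_features]
```

Because the prompt set is just a dictionary of embeddings, new categories can be added at run-time without retraining, which is the point of the open-vocabulary formulation.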
|
| |
| 10:00-11:30, Paper MoAIP-13.11 | Add to My Program |
| Language Guided Robotic Grasping with Fine-Grained Instructions |
|
| Sun, Qiang | Fudan University |
| Lin, Haitao | Fudan University |
| Fu, Ying | Beijing Institute of Technology |
| Fu, Yanwei | Fudan University |
| Xue, Xiangyang | Fudan University |
Keywords: Visual Learning, Semantic Scene Understanding, Grasping
Abstract: Given a single RGB image and attribute-rich language instructions, this paper investigates the novel problem of using Fine-grained instructions for Language guided robotic Grasping (FLarG). This problem is made challenging by the need to learn fine-grained language descriptions to ground target objects. Recent advances have grounded objects visually using only a few coarse attributes; however, these methods perform poorly because they cannot align the multi-modal features well and do not make the best use of recent powerful large pre-trained vision and language models, e.g., CLIP. To this end, this paper proposes a FLarG pipeline comprising CLIP-guided object localization and 6-DoF category-level object pose estimation for grasping. Specifically, we first take the CLIP-based segmentation model CRIS as the backbone and propose an end-to-end DyCRIS model that uses a novel dynamic mask strategy to fuse the multi-level language and vision features. Then, the well-trained instance segmentation backbone Mask R-CNN is adopted to further improve the mask predicted by our DyCRIS. Finally, the target object pose is inferred for robotic grasping using a recent 6-DoF object pose estimation method. To validate our CLIP-enhanced pipeline, we also construct a validation dataset for our FLarG task, named RefNOCS. Extensive results on RefNOCS show the utility and effectiveness of our proposed method. The project homepage is available at https://sunqiang85.github.io/FLarG.
|
| |
| 10:00-11:30, Paper MoAIP-13.12 | Add to My Program |
| Whole Shape Estimation of Transparent Object from Its Contour Using Statistical Shape Model |
|
| Okada, Kaihei | Kanazawa University |
| Kobayashi, Riku | Kanazawa University |
| Tsuji, Tokuo | Kanazawa University |
| Hiramitsu, Tatsuhiro | Kanazawa University |
| Seki, Hiroaki | Kanazawa University |
| Nishimura, Toshihiro | Kanazawa University |
| Suzuki, Yosuke | Kanazawa University |
| Watanabe, Tetsuyou | Kanazawa University |
Keywords: Computer Vision for Automation, Computer Vision for Manufacturing
Abstract: This paper presents a method for estimating the 3D shape of transparent objects from an RGB-D image using a statistical shape model. Statistical shape models compress the dimensions of multiple shapes to represent variations in shape with fewer parameters. Because it is difficult to measure the depth of a transparent object with any sensor, the statistical shape model is deformed to fit the contour extracted from the RGB image in order to estimate the object's shape. The depth image is used only for detecting the plane on which the transparent objects are placed. Unlike other estimation methods, the proposed method estimates the whole shape of transparent objects. Its estimation accuracy is compared with that of a machine-learning-based method, and the estimated whole shape is compared with data measured by a 3D scanner.
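The statistical-shape-model reconstruction step has a simple form: a shape is the mean shape plus a small number of weighted principal components. The sketch below is a toy illustration with made-up 4-element "shapes"; the `reconstruct` name and the basis are assumptions, not the paper's data.

```python
# Toy sketch (assumed basis/coefficients): reconstruct a shape as
# mean + sum_i coeffs[i] * basis[i], element-wise. Fitting the model to a
# contour amounts to searching for the coefficients that best explain it.

def reconstruct(mean_shape, basis, coeffs):
    """shape = mean + weighted sum of shape-variation components."""
    shape = list(mean_shape)
    for c, component in zip(coeffs, basis):
        shape = [s + c * b for s, b in zip(shape, component)]
    return shape
```

Because only the few coefficients are optimized, fitting stays low-dimensional even though the reconstructed shape itself is dense.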
|
| |
| MoAIP-14 Regular session, Hall E |
Add to My Program |
| Clone of 'Localization I' |
|
| |
| |
| 10:00-11:30, Paper MoAIP-14.1 | Add to My Program |
| A Hierarchical Multi-Task Visual Relocalization System |
|
| Yin, Jiahao | Beihang University |
| Xiao, Huahui | BUAA |
| Li, Wei | Beihang University |
| Zhou, Xinyu | University of International Business and Economics |
| Liu, Zhili | Yihang Intellitech Co., Ltd |
| Li, Xue | Yihang Intellitech Co., Ltd |
| Fan, Shengyin | Yihang Intellitech Co., Ltd |
Keywords: Localization, SLAM, Autonomous Vehicle Navigation
Abstract: Locating the 6DoF pose of a camera in a known scene is a fundamental problem of SLAM. Hierarchical relocalization methods, which retrieve images first and match feature points later, have been widely studied for their high accuracy. In this paper, building on hierarchical relocalization, we propose HAPOR (Hierarchical-features Aligned Projection Optimization for Relocalization), an end-to-end relocalization system that combines image retrieval and iterative pose optimization. Through an attention-mechanism branch, foreground dynamic objects and repeating textures are filtered out. We further design an image retrieval system (GTLGR) within HAPOR and generate an initial pose based on the co-visibility graph for subsequent iterative optimization. In addition, relying on GPS as ground truth for image retrieval training is quite inefficient; we therefore model the common visible area of two cameras' views in 3D, which significantly reduces the training time. Finally, we apply HAPOR to the ORB-SLAM2 system and obtain state-of-the-art relocalization results. Here is a demo: https://www.youtube.com/watch?v=rCLpWCxN31M
|
| |
| 10:00-11:30, Paper MoAIP-14.2 | Add to My Program |
| RI-LIO: Reflectivity Image Assisted Tightly-Coupled LiDAR-Inertial Odometry |
|
| Zhang, Yanfeng | Institute of Automation, Chinese Academy of Sciences |
| Tian, Yunong | Institute of Automation, Chinese Academy of Sciences |
| Wang, Wanguo | State Grid Intelligence Technology Co., Ltd |
| Yang, Guodong | Institute of Automation, Chinese Academy of Sciences |
| Li, Zhishuo | Chinese Academy of Sciences |
| Jing, Fengshui | Institute of Automation, CAS |
| Tan, Min | Institute of Automation, Chinese Academy of Sciences |
Keywords: Localization, SLAM, Mapping
Abstract: In this letter, we propose RI-LIO, a new reflectivity image assisted tightly-coupled LiDAR-inertial odometry (LIO) framework that introduces additional reflectivity texture information to efficiently reduce the drift of geometric-only methods. To achieve this, we construct an iterated extended Kalman filter framework by blending the point-to-plane geometric measurement and the reflectivity image measurement. Specifically, the geometric measurement is defined as the distance from the raw point of a new scan to its nearest neighbor plane in the global incremental kd-tree map. The searched nearest neighbor point is used to render a sparse reflectivity image after LiDAR motion distortion information is given by its corresponding raw point. Then, the reflectivity measurement is built to align the sparse reflectivity image with the dense reflectivity image of the current scan by minimizing the photometric errors directly. In addition, based on the mechanism of high-resolution LiDAR, a corrected spherical projection model is proposed to project spatial points into the image frame. Finally, extensive experiments are conducted using different mobile robots in structured, unstructured and challenging open field scenarios. The results demonstrate that the proposed method outperforms existing geometric-only methods in terms of robustness and accuracy, especially in the rotation direction.
|
| |
| 10:00-11:30, Paper MoAIP-14.3 | Add to My Program |
| Off the Radar: Uncertainty-Aware Radar Place Recognition with Introspective Querying and Map Maintenance |
|
| Yuan, Jianhao | University of Oxford |
| Newman, Paul | Oxford University |
| Gadd, Matthew | University of Oxford |
Keywords: Localization, Mapping, Deep Learning Methods
Abstract: Localisation with Frequency-Modulated Continuous-Wave (FMCW) radar has gained increasing interest due to its inherent resistance to challenging environments. However, complex artefacts of the radar measurement process require appropriate uncertainty estimation to ensure the safe and reliable application of this promising sensor modality. In this work, we propose a multi-session map management system which constructs the 'best' maps for further localisation based on learned variance properties in an embedding space. Using the same variance properties, we also propose a new way to introspectively reject localisation queries that are likely to be incorrect. For this, we apply robust noise-aware metric learning, which both leverages the short-timescale variability of radar data along a driven path (for data augmentation) and predicts the downstream uncertainty in metric-space-based place recognition. We prove the effectiveness of our method over extensive cross-validated tests of the Oxford Radar RobotCar and MulRan datasets. In this, we outperform the current state-of-the-art in radar place recognition and other uncertainty-aware methods when using only single nearest-neighbour queries. We also show consistent performance increases when rejecting queries based on uncertainty over a difficult test environment, which we did not observe for a competing uncertainty-aware place recognition system.
|
| |
| 10:00-11:30, Paper MoAIP-14.4 | Add to My Program |
| Global Localization in Unstructured Environments Using Semantic Object Maps Built from Various Viewpoints |
|
| Ankenbauer, Jacqueline | Massachusetts Institute of Technology |
| Lusk, Parker C. | Massachusetts Institute of Technology |
| Thomas, Annika | Massachusetts Institute of Technology |
| How, Jonathan | Massachusetts Institute of Technology |
Keywords: Localization, Mapping, SLAM
Abstract: We present a novel framework for global localization and guided relocalization of a vehicle in an unstructured environment. Compared to existing methods, our pipeline does not rely on cues from urban fixtures (e.g., lane markings, buildings), nor does it make assumptions that require the vehicle to be navigating on a road network. Instead, we achieve localization in both urban and non-urban environments by robustly associating and registering the vehicle's local semantic object map with a compact semantic reference map, potentially built from other viewpoints, time periods, and/or modalities. Robustness to noise, outliers, and missing objects is achieved through our graph-based data association algorithm. Further, the guided relocalization capability of our pipeline mitigates drift inherent in odometry-based localization after the initial global localization. We evaluate our pipeline on two publicly-available, real-world datasets to demonstrate its effectiveness at global localization in both non-urban and urban environments. The Katwijk Beach Planetary Rover dataset [1] is used to show our pipeline's ability to perform accurate global localization in unstructured environments. Demonstrations on the KITTI dataset [2] achieve an average pose error of 3.8m across all 35 localization events on Sequence 00 when localizing in a reference map created from aerial images. Compared to existing works, our pipeline is more general because it can perform global localization in unstructured environments using maps built from different viewpoints.
|
| |
| 10:00-11:30, Paper MoAIP-14.5 | Add to My Program |
| Constructing Metric-Semantic Maps Using Floor Plan Priors for Long-Term Indoor Localization |
|
| Zimmerman, Nicky | University of Bonn |
| Sodano, Matteo | Photogrammetry and Robotics Lab, University of Bonn |
| Marks, Elias Ariel | University of Bonn |
| Behley, Jens | University of Bonn |
| Stachniss, Cyrill | University of Bonn |
Keywords: Localization, Mapping
Abstract: Object-based maps are relevant for scene understanding since they integrate geometric and semantic information of the environment, allowing autonomous robots to robustly localize and interact with objects. In this paper, we address the task of constructing a metric-semantic map for the purpose of long-term object-based localization. We exploit 3D object detections from monocular RGB frames both for the object-based map construction and for globally localizing in the constructed map. To tailor the approach to a target environment, we propose an efficient way of generating 3D annotations to finetune the 3D object detection model. We evaluate our map construction in an office building, and test our long-term localization approach on challenging sequences recorded in the same environment over nine months. The experiments suggest that our approach is suitable for constructing metric-semantic maps, and that our localization approach is robust to long-term changes. Both the mapping algorithm and the localization pipeline can run online on an onboard computer. We release an open-source C++/ROS implementation of our approach.
|
| |
| 10:00-11:30, Paper MoAIP-14.6 | Add to My Program |
| DisPlacing Objects: Improving Dynamic Vehicle Detection Via Visual Place Recognition under Adverse Conditions |
|
| Hausler, Stephen | CSIRO |
| Garg, Sourav | Queensland University of Technology |
| Chakravarty, Punarjay | Planet |
| Shrivastava, Shubham | Ford Greenfield Labs |
| Vora, Ankit | Ford Motor Company |
| Milford, Michael J | Queensland University of Technology |
Keywords: Autonomous Vehicle Navigation, Object Detection, Segmentation and Categorization, Localization
Abstract: Can knowing where you are assist in perceiving objects in your surroundings, especially under adverse weather and lighting conditions? In this work we investigate whether a prior map can be leveraged to aid in the detection of dynamic objects in a scene without the need for a 3D map or pixel-level map-query correspondences. We contribute an algorithm which refines an initial set of candidate object detections and produces a refined subset of highly accurate detections using a prior map. We begin by using visual place recognition (VPR) to retrieve a prior map image for a given query image, then use a binary classification neural network that compares the query and prior map image regions to validate the query detection. Once trained on approximately 1000 query-map image pairs, our classification network is able to improve the performance of vehicle detection when combined with an existing off-the-shelf vehicle detector. We demonstrate our approach using standard datasets across two cities (Oxford and Zurich) under different settings of train-test separation of map-query traverse pairs. We further emphasize the performance gains of our approach against alternative design choices and show that VPR suffices for the task, eliminating the need for precise ground truth localization.
|
| |
| 10:00-11:30, Paper MoAIP-14.7 | Add to My Program |
| FM-Loc: Using Foundation Models for Improved Vision-Based Localization |
|
| Mirjalili, Reihaneh | University of Technology Nuremberg |
| Krawez, Michael | University of Technology Nuremberg |
| Burgard, Wolfram | University of Technology Nuremberg |
Keywords: Localization, SLAM, Vision-Based Navigation
Abstract: Visual place recognition is essential for vision-based robot localization and SLAM. Despite the tremendous progress made in recent years, place recognition in changing environments remains challenging. A promising approach to cope with appearance variations is to leverage high-level semantic features like objects or place categories. In this paper, we propose FM-Loc, a novel image-based localization approach based on Foundation Models that uses the Large Language Model GPT-3 in combination with the Visual-Language Model CLIP to construct a semantic image descriptor that is robust to severe changes in scene geometry and camera viewpoint. We deploy CLIP to detect objects in an image, GPT-3 to suggest potential room labels based on the detected objects, and CLIP again to propose the most likely location label. The object labels and the scene label constitute an image descriptor that we use to calculate a similarity score between the query and database images. We validate our approach on real-world data that exhibit significant changes in camera viewpoints and object placement between the database and query trajectories. The experimental results demonstrate that our method is applicable to a wide range of indoor scenarios without the need for training or fine-tuning.
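The descriptor-and-similarity step can be illustrated with a toy bag-of-labels sketch; the vocabulary, labels, and helper names below are hypothetical stand-ins for the CLIP/GPT-3 outputs described above:

```python
import numpy as np

def label_descriptor(labels, vocab):
    """Build a bag-of-labels image descriptor from detected object labels plus
    a scene label (a simplified stand-in for the CLIP/GPT-3 pipeline)."""
    v = np.zeros(len(vocab))
    for lab in labels:
        if lab in vocab:
            v[vocab.index(lab)] += 1.0
    return v

def similarity(a, b):
    """Cosine similarity between two descriptors (0 if either is empty)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

vocab = ["chair", "monitor", "sink", "office", "kitchen"]   # illustrative only
query = label_descriptor(["chair", "monitor", "office"], vocab)
db_office = label_descriptor(["chair", "monitor", "monitor", "office"], vocab)
db_kitchen = label_descriptor(["sink", "kitchen"], vocab)
s_office = similarity(query, db_office)
s_kitchen = similarity(query, db_kitchen)
```

A query taken in an office scores higher against an office database image than against a kitchen one, which is the matching behaviour the descriptor is designed to produce.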
|
| |
| 10:00-11:30, Paper MoAIP-14.8 | Add to My Program |
| Joint On-Manifold Gravity and Accelerometer Intrinsics Estimation for Inertially Aligned Mapping |
|
| Nemiroff, Ryan | University of California, Los Angeles |
| Chen, Kenny | University of California, Los Angeles |
| Lopez, Brett | University of California, Los Angeles |
Keywords: Localization, Mapping, SLAM
Abstract: Aligning a robot's trajectory or map to the inertial frame is a critical capability that is often difficult to do accurately even though inertial measurement units (IMUs) can observe absolute roll and pitch with respect to gravity. Accelerometer biases and scale factor errors from the IMU's initial calibration are often the major source of inaccuracies when aligning the robot's odometry frame with the inertial frame, especially for low-grade IMUs. Practically, one would simultaneously estimate the true gravity vector, accelerometer biases, and scale factor to improve measurement quality but these quantities are not observable unless the IMU is sufficiently excited. While several methods estimate accelerometer bias and gravity, they do not explicitly address the observability issue nor do they estimate scale factor. We present a fixed-lag factor-graph-based estimator to address both of these issues. In addition to estimating accelerometer scale factor, our method mitigates limited observability by optimizing over a time window an order of magnitude larger than existing methods with significantly lower computational burden. The proposed method, which estimates accelerometer intrinsics and gravity separately from the other states, is enabled by a novel, velocity-agnostic measurement model for intrinsics and gravity, as well as a new method for gravity vector optimization on S2. Accurate IMU state prediction, gravity-alignment, and roll/pitch drift correction are experimentally demonstrated on public and self-collected datasets in diverse environments.
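The S2 constraint mentioned above (a unit gravity direction has only two degrees of freedom) can be sketched with a generic tangent-space retraction; this is an assumption-laden illustration, not the paper's specific parameterization:

```python
import numpy as np

def retract_s2(g, delta):
    """Update a unit gravity direction g by a 2D tangent-space step delta,
    staying on the sphere S2 (a generic retraction, not the paper's exact one)."""
    # Build an orthonormal basis (b1, b2) of the tangent plane at g.
    tmp = np.array([1.0, 0.0, 0.0]) if abs(g[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    b1 = np.cross(g, tmp)
    b1 /= np.linalg.norm(b1)
    b2 = np.cross(g, b1)
    step = delta[0] * b1 + delta[1] * b2
    g_new = g + step
    return g_new / np.linalg.norm(g_new)    # project back to unit norm

g = np.array([0.0, 0.0, -1.0])              # initial guess: gravity points down
g_new = retract_s2(g, np.array([0.05, -0.02]))
unit_norm_error = abs(float(np.linalg.norm(g_new)) - 1.0)
```

Parameterizing the update in the 2D tangent space avoids the over-parameterization of optimizing all three components of the gravity vector directly.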
|
| |
| 10:00-11:30, Paper MoAIP-14.9 | Add to My Program |
| I2P-Rec: Recognizing Images on Large-Scale Point Cloud Maps through Bird's Eye View Projections |
|
| Zheng, Shuhang | Zhejiang University |
| Li, Yixuan | Zhejiang University |
| Yu, Zhu | Zhejiang University |
| Yu, Beinan | Zhejiang University |
| Cao, Siyuan | Zhejiang University |
| Wang, Minhang | HAOMO.AI Technology Co., Ltd |
| Xu, Jintao | HAOMO.AI Technology Co., Ltd |
| Ai, Rui | HAOMO.AI Technology Co., Ltd |
| Gu, Weihao | HAOMO.AI Technology Co., Ltd |
| Luo, Lun | Zhejiang University |
| Shen, Hui-liang | Zhejiang University |
Keywords: Localization, SLAM, Recognition
Abstract: Place recognition is an important technique for autonomous cars to achieve full autonomy since it can provide an initial guess to online localization algorithms. Although current methods based on images or point clouds have achieved satisfactory performance, localizing images on a large-scale point cloud map remains a fairly unexplored problem. This cross-modal matching task is challenging due to the difficulty in extracting consistent descriptors from images and point clouds. In this paper, we propose the I2P-Rec method to solve the problem by transforming the cross-modal data into the same modality. Specifically, we leverage the recent success of depth estimation networks to recover point clouds from images. We then project the point clouds into Bird's Eye View (BEV) images. Using the BEV image as an intermediate representation, we extract global features with a Convolutional Neural Network followed by a NetVLAD layer to perform matching. The experimental results evaluated on the KITTI dataset show that, with only a small set of training data, I2P-Rec achieves recall rates at Top-1% over 80% and 90%, when localizing monocular and stereo images on point cloud maps, respectively. We further evaluate I2P-Rec on a 1 km trajectory dataset collected by an autonomous logistics car and show that I2P-Rec can generalize well to previously unseen environments.
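The BEV intermediate representation can be sketched as a simple occupancy projection; the grid extents and resolution below are illustrative choices, not the paper's:

```python
import numpy as np

def points_to_bev(points, x_range=(0., 50.), y_range=(-25., 25.), res=0.5):
    """Project an N x 3 point cloud to a bird's-eye-view occupancy image
    (a simplified sketch of the intermediate representation)."""
    h = int((x_range[1] - x_range[0]) / res)
    w = int((y_range[1] - y_range[0]) / res)
    bev = np.zeros((h, w), dtype=np.float32)
    xi = ((points[:, 0] - x_range[0]) / res).astype(int)
    yi = ((points[:, 1] - y_range[0]) / res).astype(int)
    keep = (xi >= 0) & (xi < h) & (yi >= 0) & (yi < w)
    bev[xi[keep], yi[keep]] = 1.0          # mark occupied cells
    return bev

# Two in-range points and one point outside the grid.
cloud = np.array([[10.0, 0.0, 1.2], [12.0, 3.0, 0.3], [60.0, 0.0, 0.0]])
bev = points_to_bev(cloud)
occupied = int(bev.sum())
```

Once both the camera-derived and map point clouds are rasterized this way, the cross-modal problem reduces to matching same-modality images, e.g. with a CNN plus NetVLAD head.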
|
| |
| 10:00-11:30, Paper MoAIP-14.10 | Add to My Program |
| CoPR: Towards Accurate Visual Localization with Continuous Place-Descriptor Regression (I) |
|
| Zaffar, Mubariz | Delft University of Technology |
| Nan, Liangliang | TU Delft |
| Kooij, Julian Francisco Pieter | TU Delft |
Keywords: Localization, Mapping, SLAM, Visual Place Recognition
Abstract: Visual Place Recognition (VPR) is an image-based localization method that estimates the camera location of a query image by retrieving the most similar reference image from a map of geo-tagged reference images. In this work, we look into two fundamental bottlenecks for its localization accuracy: reference map sparseness and viewpoint invariance. Firstly, the reference images for VPR are only available at sparse poses in a map, which enforces an upper bound on the maximum achievable localization accuracy through VPR. We therefore propose Continuous Place-descriptor Regression (CoPR) to densify the map and improve localization accuracy. We study various interpolation and extrapolation models to regress additional place descriptors from only the existing references. Secondly, we compare different feature encoders and show that CoPR presents value for all of them. We evaluate our models on three existing public datasets and report on average around 30% improvement in VPR-based localization accuracy using CoPR, on top of the 15% increase by using a viewpoint-variant loss for the feature encoder. The complementary relation between CoPR and relative pose estimation is also discussed.
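The simplest regression model one might study for densifying the map is linear interpolation of descriptors between two reference poses; the sketch below illustrates the idea only (CoPR itself studies learned interpolation and extrapolation models):

```python
import numpy as np

def interpolate_descriptor(d_a, d_b, pose_a, pose_b, pose_q):
    """Regress a place descriptor at query pose pose_q by linear interpolation
    between the two nearest reference descriptors (the simplest possible
    regression model; names and normalisation are our assumptions)."""
    t = np.linalg.norm(pose_q - pose_a) / np.linalg.norm(pose_b - pose_a)
    d = (1.0 - t) * d_a + t * d_b
    return d / np.linalg.norm(d)           # keep the descriptor L2-normalised

d_a = np.array([1.0, 0.0])
d_b = np.array([0.0, 1.0])
pose_a = np.array([0.0, 0.0])
pose_b = np.array([10.0, 0.0])
d_mid = interpolate_descriptor(d_a, d_b, pose_a, pose_b, np.array([5.0, 0.0]))
d_norm = float((d_mid ** 2).sum())
```

The regressed descriptors are inserted into the reference map at intermediate poses, lowering the pose error of the nearest retrieved reference.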
|
| |
| 10:00-11:30, Paper MoAIP-14.11 | Add to My Program |
| Complete Closed-Form and Accurate Solution to Pose Estimation from 3D Correspondences |
|
| Malis, Ezio | Inria |
Keywords: Localization, SLAM, Autonomous Vehicle Navigation
Abstract: Computing the pose from 3D data acquired in two different frames is of high importance for several robotic tasks like odometry, SLAM and place recognition. The pose is generally obtained by solving a least-squares problem given point-to-point, point-to-plane or point-to-line correspondences. The non-linear least-squares problem can be solved by iterative optimization or, more efficiently, in closed-form by using solvers of polynomial systems. In this paper, a complete and accurate closed-form solution for a weighted least-squares problem is proposed. Adding a weight for each correspondence allows increasing robustness to outliers. Contrary to existing methods, the proposed approach is complete since it is able to solve the problem in any non-degenerate case, and it is accurate since it is guaranteed to find the global optimal estimate of the weighted least-squares problem. Simulations and experiments on real data demonstrate the superior accuracy and robustness of the proposed algorithm compared to previous approaches.
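For the point-to-point case, the weighted least-squares pose has a classical SVD-based closed form (Kabsch/Umeyama); the sketch below shows that special case only, not the complete polynomial-system solver proposed in the paper:

```python
import numpy as np

def weighted_pose_3d(P, Q, w):
    """Closed-form R, t minimising sum_i w_i ||R p_i + t - q_i||^2
    (point-to-point correspondences only; SVD-based Kabsch solution)."""
    w = w / w.sum()
    p_bar = w @ P                           # weighted centroids
    q_bar = w @ Q
    H = (P - p_bar).T @ ((Q - q_bar) * w[:, None])
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # guard against reflections
    R = Vt.T @ S @ U.T
    t = q_bar - R @ p_bar
    return R, t

# Recover a known rotation about z and a translation from noise-free data.
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 0.5])
P = np.random.default_rng(0).normal(size=(20, 3))
Q = P @ R_true.T + t_true
R_est, t_est = weighted_pose_3d(P, Q, np.ones(20))
rot_err = float(np.abs(R_est - R_true).max())
trans_err = float(np.abs(t_est - t_true).max())
```

Down-weighting suspected outliers in `w` is what gives the weighted formulation its robustness; handling mixed point/plane/line correspondences in closed form is the harder problem the paper addresses.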
|
| |
| 10:00-11:30, Paper MoAIP-14.12 | Add to My Program |
| Toward Consistent and Efficient Map-Based Visual-Inertial Localization: Theory Framework and Filter Design (I) |
|
| Zhang, Zhuqing | Zhejiang University |
| Song, Yang | University of Technology Sydney |
| Huang, Shoudong | University of Technology, Sydney |
| Xiong, Rong | Zhejiang University |
| Wang, Yue | Zhejiang University |
Keywords: Localization, Sensor Fusion, SLAM, Consistent Filter
Abstract: This paper focuses on designing a consistent and efficient filter for visual-inertial localization given a pre-built map. First, we propose a new Lie group with its algebra, based on which a novel invariant extended Kalman filter (invariant EKF) is designed. We theoretically prove that, when we do not consider the uncertainty of map information, the proposed invariant EKF is able to naturally preserve the correct observability properties of the system. To consider the uncertainty of map information, we introduce a Schmidt filter. With the Schmidt filter, the uncertainty of map information can be taken into consideration to avoid over-confident estimation while the computation cost only increases linearly with the size of the map keyframes. In addition, we introduce an easily implemented observability-constrained technique because directly combining the invariant EKF with the Schmidt filter cannot maintain the correct observability properties of the system that considers the uncertainty of map information. Finally, we validate our proposed system's high consistency, accuracy, and efficiency via extensive simulations and real-world experiments.
|
| |
| 10:00-11:30, Paper MoAIP-14.13 | Add to My Program |
| WiFi Similarity-Based Odometry (I) |
|
| Ismail, Khairuldanial | Singapore University of Technology and Design |
| Liu, Ran | Southwest University of Science and Technology |
| Athukorala, Achala | Singapore University of Technology and Design |
| Ng, Benny Kai Kiat | Singapore University of Technology and Design |
| Yuen, Chau | Nanyang Technological University |
| Tan, U-Xuan | Singapore University of Technology and Design |
Keywords: Localization
Abstract: Odometry is commonly used in localization applications especially with wheeled platforms since encoders are readily available. It is often used by itself or fused with other sensor data to obtain a better estimate. However, its limitation is its exclusivity to wheeled platforms whereas it is often desired to have similar encoder odometry options on other systems. Given that WiFi is ubiquitous in most commercial and industrial areas, in this paper, a method is proposed for obtaining odometry from WiFi scans for position estimation. The method is not constrained to wheel robots such as the case for wheeled odometry and does not rely on the traditional fingerprinting method. The proposed method involves training a neural network model to predict the distance moved based on features extracted from WiFi scans in the environment. These distances moved are then summed up to obtain the trajectory. Experiments are conducted and the methods are evaluated based on Root Mean Square Error (RMSE). Experimental results showed that the proposed method is able to achieve an RMSE of at most 8.39m for the various test cases.
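Summing predicted per-step distances into a trajectory, and scoring it by RMSE, can be sketched as follows; the headings are an added assumption of this sketch (the paper's model predicts distances moved from WiFi scan features):

```python
import numpy as np

def accumulate_trajectory(start, headings, step_distances):
    """Sum per-step distances (e.g. predicted from WiFi scan features) along
    known headings to form a 2D trajectory."""
    pos = np.array(start, dtype=float)
    traj = [pos.copy()]
    for h, d in zip(headings, step_distances):
        pos = pos + d * np.array([np.cos(h), np.sin(h)])
        traj.append(pos.copy())
    return np.array(traj)

def rmse(est, gt):
    """Root Mean Square Error between estimated and ground-truth positions."""
    return float(np.sqrt(np.mean(np.sum((est - gt) ** 2, axis=1))))

# Straight-line walk: four 1 m steps due east, matching the ground truth.
est = accumulate_trajectory([0, 0], [0.0] * 4, [1.0] * 4)
gt = np.array([[i, 0.0] for i in range(5)])
err = rmse(est, gt)
```

In the paper the per-step distances come from a neural network trained on WiFi scan features; the accumulation and RMSE evaluation are as above.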
|
| |
| MoAIP-15 Regular session, Hall E |
Add to My Program |
| Sensor Fusion for SLAM |
|
| |
| |
| 10:00-11:30, Paper MoAIP-15.1 | Add to My Program |
| LIO-PPF: Fast LiDAR-Inertial Odometry Via Incremental Plane Pre-Fitting and Skeleton Tracking |
|
| Chen, Xingyu | Peking University |
| Wu, Peixi | Peking University |
| Li, Ge | Peking University Shenzhen Graduate School |
| Li, Thomas H. | Advanced Institute of Information Technology, Peking University; |
Keywords: SLAM, Mapping, Localization
Abstract: As a crucial infrastructure of intelligent mobile robots, LiDAR-Inertial odometry (LIO) provides the basic capability of state estimation by tracking LiDAR scans. The high-accuracy tracking generally involves the kNN search, which is used with minimizing the point-to-plane distance. The cost for this, however, is maintaining a large local map and performing a kNN plane fit for each point. In this work, we reduce both the time and space complexity of LIO by saving these unnecessary costs. Technically, we design a plane pre-fitting (PPF) pipeline to track the basic skeleton of the 3D scene. In PPF, planes are not fitted individually for each scan, let alone for each point, but are updated incrementally as the scene 'flows'. Unlike kNN, the PPF is more robust to noisy and non-strict planes with our iterative Principal Component Analysis (iPCA) refinement. Moreover, a simple yet effective sandwich layer is introduced to eliminate false point-to-plane matches. Our method was extensively tested on a total of 22 sequences across 5 open datasets, and evaluated in 3 existing state-of-the-art LIO systems. By contrast, LIO-PPF can consume only 36% of the original local map size to achieve up to 4x faster residual computing and 1.92x overall FPS, while maintaining the same level of accuracy. We fully open source our implementation at https://github.com/xingyuuchen/LIO-PPF.
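The idea of updating plane statistics incrementally rather than refitting kNN neighborhoods per point can be sketched with a running mean/covariance (Welford-style) plane estimator; this is our illustration of the concept, not the paper's iPCA implementation:

```python
import numpy as np

class IncrementalPlane:
    """Maintain a plane estimate from streaming points by incrementally
    updating mean and second-moment statistics, so no neighborhood needs
    to be stored or refitted per point."""
    def __init__(self):
        self.n = 0
        self.mean = np.zeros(3)
        self.M = np.zeros((3, 3))          # sum of outer products of residuals

    def add(self, p):
        self.n += 1
        delta = p - self.mean
        self.mean += delta / self.n
        self.M += np.outer(delta, p - self.mean)   # Welford-style update

    def normal(self):
        evals, evecs = np.linalg.eigh(self.M / self.n)
        return evecs[:, 0], evals          # eigenvector of smallest eigenvalue

plane = IncrementalPlane()
for p in [[0, 0, 1], [1, 0, 1], [0, 1, 1], [1, 1, 1], [2, 2, 1]]:
    plane.add(np.array(p, dtype=float))
n_vec, evals = plane.normal()
flat = float(evals[0])                     # ~0 for a strictly planar patch
normal_z = abs(float(n_vec[2]))
```

The smallest eigenvalue doubles as a flatness score, which is one way to reject the noisy, non-strict planes the abstract mentions.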
|
| |
| 10:00-11:30, Paper MoAIP-15.2 | Add to My Program |
| EDI: ESKF-Based Disjoint Initialization for Visual-Inertial SLAM Systems |
|
| Wang, Weihan | Stevens Institute of Technology |
| Li, Jiani | Vanderbilt University |
| Ming, Yuhang | Hangzhou Dianzi University |
| Mordohai, Philippos | Stevens Institute of Technology |
Keywords: Visual-Inertial SLAM, SLAM, Localization
Abstract: Visual-inertial initialization can be classified into joint and disjoint approaches. Joint approaches tackle both the visual and the inertial parameters together by aligning observations from feature-bearing points based on IMU integration, and then use a closed-form solution with visual and acceleration observations to find initial velocity and gravity. In contrast, disjoint approaches independently solve the Structure from Motion (SFM) problem and determine inertial parameters from up-to-scale camera poses obtained from pure monocular SLAM. However, previous disjoint methods have limitations, like assuming negligible acceleration bias impact or accurate rotation estimation by pure monocular SLAM. To address these issues, we propose EDI, a novel approach for fast, accurate, and robust visual-inertial initialization. Our method incorporates an Error-state Kalman Filter (ESKF) to estimate gyroscope bias and correct rotation estimates from monocular SLAM, overcoming dependence on pure monocular SLAM for rotation estimation. To estimate the scale factor without prior information, we offer a closed-form solution for initial velocity, scale, gravity, and acceleration bias estimation. To address gravity and acceleration bias coupling, we introduce weights in the linear least-squares equations, ensuring acceleration bias observability and handling outliers. Extensive evaluation on the EuRoC dataset shows that our method achieves an average scale error of 5.8% in less than 3 seconds, outperforming other state-of-the-art disjoint visual-inertial initialization approaches, even in challenging environments and with artificial noise corruption.
|
| |
| 10:00-11:30, Paper MoAIP-15.3 | Add to My Program |
| SELVO: A Semantic-Enhanced Lidar-Visual Odometry |
|
| Jiang, Kun | UCAS |
| Gao, Shuang | OPPO Research Institute |
| Zhang, Xudong | OPPO Research Institute |
| Li, Jijunnan | OPPO Research Institute |
| Guo, Yandong | OPPO Research Institute |
| Shijie, Liu | Hangzhou Institute for Advanced Study, UCAS |
| Li, Chunlai | Shanghai Institute of Technical Physics (SITP), Chinese Academy |
| Wang, Jianyu | Shanghai Institute of Technical Physics of the Chinese Academy O |
Keywords: SLAM, Localization, Computer Vision for Automation
Abstract: In the face of a complex external environment, single-sensor information can no longer meet the accuracy requirements of low-drift SLAM. In this paper, we focus on the fusion scheme of cameras and lidar, and explore the gain that semantic information brings to a SLAM system. A Semantic-Enhanced Lidar-Visual Odometry (SELVO) is proposed to achieve pose estimation with high accuracy and robustness by applying semantics and utilizing strategies of initialization and sensor fusion. In the loop closure detection thread, we propose a novel place recognition method based on semantic information to maintain the global consistency of the map. In the back-end, we design a joint optimization framework including visual odometry, lidar odometry and loop closure detection, and innovatively propose to recognize degraded scenes with semantic information. We have conducted a large number of experiments on the KITTI and KITTI-360 datasets, and the results show that our system achieves high accuracy and competitive performance in comparison with state-of-the-art methods.
|
| |
| 10:00-11:30, Paper MoAIP-15.4 | Add to My Program |
| LIWO: Lidar-Inertial-Wheel Odometry |
|
| Yuan, Zikang | Huazhong University, Wuhan, 430073, China |
| Lang, Fengtian | Huazhong University of Science and Technology |
| Xu, Tianle | Huazhong University of Science and Technology |
| Yang, Xin | Huazhong University of Science and Technology |
Keywords: SLAM, Localization
Abstract: LiDAR-inertial odometry (LIO), which fuses the complementary information of a LiDAR and an Inertial Measurement Unit (IMU), is an attractive solution for state estimation. In LIO, both pose and velocity are regarded as state variables that need to be solved. However, the widely-used Iterative Closest Point (ICP) algorithm can only provide a constraint for pose, while the velocity can only be constrained by IMU pre-integration. As a result, the velocity estimates are inclined to be updated in accordance with the pose results. In this paper, we propose LIWO, an accurate and robust LiDAR-inertial-wheel (LIW) odometry, which fuses the measurements from LiDAR, IMU and wheel encoder in a bundle adjustment (BA) based optimization framework. The involvement of a wheel encoder provides velocity measurement as an important observation, which assists LIO to provide a more accurate state prediction. In addition, constraining the velocity variable by the observation from the wheel encoder in optimization can further improve the accuracy of state estimation. Experiment results on two public datasets demonstrate that our system outperforms all state-of-the-art LIO systems in terms of smaller absolute trajectory error (ATE), and that embedding a wheel encoder can greatly improve the performance of LIO based on the BA framework.
|
| |
| 10:00-11:30, Paper MoAIP-15.5 | Add to My Program |
| VIW-Fusion: Extrinsic Calibration and Pose Estimation for Visual-IMU-Wheel Encoder System |
|
| Qiao, Chunxiao | Northeastern University, College of Information Science and Engi |
| Zhao, Shuying | Northeastern University |
| Zhang, Yunzhou | Northeastern University |
| Wang, Yahui | UISEE (Beijing) Ltd |
| Zhang, Dan | Uisee Technology (Beijing) Co., Ltd |
Keywords: Visual-Inertial SLAM, Localization, Sensor Fusion
Abstract: The data fusion of camera, IMU, and wheel encoder measurements has proved its effectiveness in localizing ground robots, and obtaining accurate sensor extrinsic parameters is its premise. We propose an extrinsic parameter calibration algorithm and a multi-sensor-based pose estimation algorithm for the camera-IMU-wheel encoder system. First, we propose a joint calibration algorithm for the extrinsic parameters of the camera-IMU-wheel encoder system, which improves the accuracy and robustness of the camera-wheel encoder calibration. We then extend the visual-inertial odometry (VIO) to incorporate the measurements from the wheel encoder and weight the wheel encoder measurements according to angular velocity in global optimization to improve the performance. We further propose a novel method for VIO initialization by integrating wheel encoder information, which significantly reduces the scale error in initialization. We conduct extrinsic parameter calibration experiments on a real self-driving car and validate the performance of our multi-sensor-based localization system on the KAIST dataset and a dataset collected by our self-driving vehicles by performing an exhaustive comparison with the state-of-the-art algorithms. Our implementations are open source: https://github.com/chunxiaoqiao/VIW-Fusion.git.
|
| |
| 10:00-11:30, Paper MoAIP-15.6 | Add to My Program |
| LiDAR-Inertial SLAM with Efficiently Extracted Planes |
|
| Chen, Chao | Zhejiang University |
| Wu, Hangyu | Zhejiang University |
| Ma, Yukai | Zhejiang University |
| Lv, Jiajun | Zhejiang University |
| Li, Laijian | Zhejiang University |
| Liu, Yong | Zhejiang University |
Keywords: Mapping, Localization, SLAM
Abstract: This paper proposes a LiDAR-Inertial SLAM with efficiently extracted planes, which couples the planes in the odometry to improve accuracy and in the mapping for consistency. The proposed method consists of three parts: an efficient Point-to-Line-to-Plane extraction algorithm, a LiDAR-Inertial-Plane tightly coupled odometry, and plane-aided mapping with global planes. Specifically, we leverage the ring field of the LiDAR point cloud to accelerate the region-growing-based plane extraction algorithm. We propose a plane-distance-insensitive criterion for better plane association. We tightly couple the IMU pre-integration factor, LiDAR odometry factor, and plane factor in the odometry to obtain a more accurate initial pose for mapping. Furthermore, we propose a plane map management strategy based on spatial voxel hashing to improve the speed and accuracy of global map plane associations. Experimental results show that our plane extraction method is efficient, and the proposed plane-aided LiDAR-Inertial SLAM significantly improves the accuracy and consistency compared to the other state-of-the-art algorithms with only a small increase in time consumption.
|
| |
| 10:00-11:30, Paper MoAIP-15.7 | Add to My Program |
| Learning to Map Efficiently by Active Echolocation |
|
| Hu, Xixi | UT Austin |
| Purushwalkam, Senthil | Salesforce Research |
| Harwath, David | UT Austin |
| Grauman, Kristen | UT Austin and Facebook AI Research |
Keywords: Audio-Visual SLAM, SLAM
Abstract: Using visual SLAM to map new environments requires time-consuming visits to all regions for data collection. We propose an approach to estimate maps of areas beyond the visible regions using a cheap and readily available modality of data: sound. We introduce the idea of an active audio-visual mapping agent. Besides collecting visual data, the proposed agent emits sounds during navigation, captures the echoes, and uses them to accurately map unknown areas. We propose a reinforcement learning-based method that simultaneously trains models to 1) estimate a map from the visual data, 2) output navigation actions, 3) output the decision to emit a sound and 4) refine estimated maps using the captured audio. Our agent is trained and tested on 85 real-world homes from the Matterport3D dataset using the Habitat and SoundSpaces simulators for visual and audio data. Our method, unlike visual-data-reliant approaches, yields more accurate maps with broader environmental coverage. In addition, compared to an agent that continually emits sounds, we observe that intelligently choosing when to emit sounds leads to accurate maps obtained with greater efficiency.
|
| |
| 10:00-11:30, Paper MoAIP-15.8 | Add to My Program |
| Visual-LiDAR-Inertial Odometry: A New Visual-Inertial SLAM Method Based on an iPhone 12 Pro |
|
| Ye, Cang | Virginia Commonwealth University |
| Jin, Lingqiu | Virginia Commonwealth University |
Keywords: Visual-Inertial SLAM, Range Sensing
Abstract: As today's smartphone integrates various imaging sensors and Inertial Measurement Units (IMU) and becomes computationally powerful, there is a growing interest in developing smartphone-based visual-inertial (VI) SLAM methods for robotics and computer vision applications. In this paper, we introduce a new SLAM method, called Visual-LiDAR-Inertial Odometry (VLIO), based on an iPhone 12 Pro. VLIO formulates device pose estimation as an optimization problem that minimizes a cost function based on the residuals of the inertial, visual, and depth measurements. We present the first work that 1) characterizes the iPhone's LiDAR in depth measurement and identifies the models for the measurement error and standard deviation, and 2) characterizes pose change estimation with LiDAR data. The measurement models are then used to compute the depth-related and visual-feature-related residuals for the cost function. Also, VLIO tracks varying camera intrinsic parameters (CIP) in real-time and uses them in computing these residuals. Both approaches result in more accurate residual terms and thus more accurate pose estimation. The CIP tracking method eliminates the need for a sophisticated model-fitting process that includes camera calibration and pairing of the CIPs and IMU measurements with various phone orientations. Experimental results validate the efficacy of VLIO.
|
| |
| 10:00-11:30, Paper MoAIP-15.9 | Add to My Program |
| Optimization-Based VINS: Consistency, Marginalization, and FEJ |
|
| Chen, Chuchu | University of Delaware |
| Geneva, Patrick | University of Delaware |
| Peng, Yuxiang | University of Delaware |
| Lee, Woosik | University of Delaware |
| Huang, Guoquan | University of Delaware |
Keywords: Visual-Inertial SLAM, Localization, SLAM
Abstract: In this work, we present a comprehensive analysis of the application of the First-estimates Jacobian (FEJ) design methodology in nonlinear optimization-based Visual-Inertial Navigation Systems (VINS). The FEJ approach fixes system linearization points to preserve proper observability properties of VINS and has been shown to significantly improve the estimation performance of state-of-the-art filtering-based methods. However, its direct application to optimization-based estimators poses challenges and pitfalls, which we address in this paper. Specifically, we carefully examine observability and its relation to inconsistency and FEJ; based on this, we explain how to properly apply and implement FEJ within four marginalization archetypes commonly used in non-linear optimization-based frameworks. FEJ's effectiveness and applications to VINS are investigated and shown to yield significant performance improvements. Additionally, we offer a detailed discussion of results and guidelines on how to properly implement FEJ in optimization-based estimators.
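The FEJ rule the abstract builds on can be illustrated with a toy sketch: Jacobians are always evaluated at each state's first estimate, even after later updates move the estimate, so relinearization cannot change the linearized system's unobservable directions. The measurement model and names here are hypothetical:

```python
import numpy as np

def jacobian_at(x):
    # Toy nonlinear relative measurement h(x) = (x0 - x1)**2; its
    # Jacobian depends on the linearization point.
    d = x[0] - x[1]
    return np.array([[2.0 * d, -2.0 * d]])

class FEJLinearizer:
    """Minimal illustration of the First-Estimates-Jacobian rule:
    remember the first estimate of each state and linearize there."""
    def __init__(self):
        self.first_estimates = {}

    def linearize(self, key, current_estimate):
        x_lin = self.first_estimates.setdefault(key, np.array(current_estimate, float))
        return jacobian_at(x_lin)

lin = FEJLinearizer()
J_first = lin.linearize("pose01", [1.0, 0.0])   # first estimate gets stored
J_later = lin.linearize("pose01", [1.3, 0.1])   # the updated estimate is ignored
J_naive = jacobian_at(np.array([1.3, 0.1]))     # what a non-FEJ estimator would use
```

The FEJ Jacobian stays fixed across relinearizations, while the naive one drifts with the estimate; that drift is what spuriously injects information along unobservable directions.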
|
| |
| 10:00-11:30, Paper MoAIP-15.10 | Add to My Program |
| Visual-Inertial-Laser-Lidar (VILL) SLAM: Real-Time Dense RGB-D Mapping for Pipe Environments |
|
| Tian, Yu | Carnegie Mellon University |
| Wang, Luyuan | Carnegie Mellon University |
| Yan, Xinzhi | Carnegie Mellon University |
| Ruan, Fujun | Carnegie Mellon University |
| Ganapathy Subramanian, Jaya Aadityaa | Carnegie Mellon University |
| Choset, Howie | Carnegie Mellon University |
| Li, Lu | Carnegie Mellon University |
Keywords: Visual-Inertial SLAM, RGB-D Perception, Sensor Fusion
Abstract: Robotic solutions for pipeline inspection promise enhancement of human labor by automating data acquisition for pipe condition assessments, which are vital for the early detection of pipe anomalies and the prevention of hazardous leakages and explosions. Through simultaneous localization and mapping (SLAM), colorized 3D reconstructions of the pipe's inner surface can be generated, providing a more comprehensive digital record of the pipes compared to conventional vision-only inspection. Designed for generic environments, most SLAM methods suffer from limited accuracy and substantial accumulative drift in confined and featureless spaces such as pipelines, due to a lack of suitable sensor hardware and state estimation techniques. In this research, we present VILL-SLAM: a dense RGB-D SLAM algorithm that combines a monocular camera (V), an inertial sensor (I), a ring-shaped laser profiler (L), and a LiDAR (L) into a compact sensor package optimized for in-pipe operations. By fusing complementary visual and depth information from the color camera, laser profiling, and LiDAR measurements, our method overcomes the challenges of metric-scale mapping in conventional SLAM methods, despite its monocular configuration. To further improve localization accuracy, we utilize the pipe geometry to formulate two unique optimization factors that effectively constrain odometry drift. To validate our method, we conducted real-world experiments in physical pipes, comparing the performance of our approach against other state-of-the-art algorithms. The proposed SLAM framework achieved a 6.6-fold drift improvement, with 0.84% mean odometry drift over 22 meters and a mean pointwise 3D scanning error of 0.88 mm in 12-inch-diameter pipes. This research represents a significant advancement in miniature in-pipe inspection, localization, and mapping sensing techniques. It has the potential to become a core enabling technology for the next generation of highly capable in-pipe robots, capable of reconstructing photo-realistic 3D pipe scans and providing disruptive pipe locating and georeferencing capabilities.
|
| |
| 10:00-11:30, Paper MoAIP-15.11 | Add to My Program |
| Know What You Don't Know: Consistency in Sliding Window Filtering with Unobservable States Applied to Visual-Inertial SLAM |
|
| Lisus, Daniil | University of Toronto |
| Cohen, Mitchell | McGill University |
| Forbes, James Richard | McGill University |
Keywords: Visual-Inertial SLAM, Autonomous Vehicle Navigation, SLAM
Abstract: Estimation algorithms, such as the sliding window filter, produce an estimate and uncertainty of desired states. This task becomes challenging when the problem involves unobservable states. In these situations, it is critical for the algorithm to "know what it doesn't know", meaning that it must maintain the unobservable states as unobservable during algorithm deployment. This letter presents general requirements for maintaining consistency in sliding window filters involving unobservable states. The value of these requirements when designing a navigation solution is experimentally shown within the context of visual-inertial SLAM making use of IMU preintegration.
|
| |
| 10:00-11:30, Paper MoAIP-15.12 | Add to My Program |
| Versatile LiDAR-Inertial Odometry with SE(2) Constraints for Ground Vehicles |
|
| Chen, Jiaying | Nanyang Technological University |
| Wang, Han | Nanyang Technological University |
| Hu, Minghui | Nanyang Technological University |
| Suganthan, Ponnuthurai Nagaratnam | Nanyang Technological University |
Keywords: SLAM, Localization, Industrial Robots
Abstract: LiDAR SLAM has become one of the major localization systems for ground vehicles since LiDAR Odometry and Mapping (LOAM). Many extensions of LOAM mainly leverage one specific constraint to improve performance, e.g., information from on-board sensors such as loop closure and inertial state, or prior conditions such as ground level and motion dynamics. In many robotic applications, these conditions are often known only partially; hence SLAM becomes a comprehensive problem involving numerous constraints, and a better SLAM result can be achieved by fusing them properly. In this paper, we propose a hybrid LiDAR-inertial SLAM framework that leverages both the on-board perception system and prior information such as motion dynamics to improve localization performance. In particular, we consider the case of ground vehicles, which are commonly used for autonomous driving and warehouse logistics. We present a computationally efficient LiDAR-inertial odometry method that directly parameterizes ground vehicle poses on SE(2). The out-of-SE(2) motion perturbations are not neglected but incorporated into an integrated noise term of a novel SE(2)-constraints model. For odometric measurement processing, we propose a versatile, tightly coupled LiDAR-inertial odometry to achieve better pose estimation than traditional LiDAR odometry.
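The SE(2) parameterization with out-of-plane motion folded into noise can be sketched as follows. This is a toy model in which discarded z/roll/pitch motion simply inflates the planar noise term; it is illustrative, not the paper's actual SE(2)-constraints model:

```python
import numpy as np

def se2_compose(pose, delta):
    # Compose an SE(2) pose (x, y, theta) with a body-frame
    # increment (dx, dy, dtheta).
    x, y, th = pose
    dx, dy, dth = delta
    c, s = np.cos(th), np.sin(th)
    return np.array([x + c * dx - s * dy,
                     y + s * dx + c * dy,
                     th + dth])

def fold_out_of_plane(delta3d, sigma_base=0.01):
    # Project a full 3-D increment (dx, dy, dz, droll, dpitch, dyaw)
    # onto SE(2) and treat the discarded motion as extra noise: a toy
    # version of "not neglected but incorporated into a noise term".
    dx, dy, dz, droll, dpitch, dyaw = delta3d
    leakage = abs(dz) + abs(droll) + abs(dpitch)
    return np.array([dx, dy, dyaw]), sigma_base + leakage

pose = np.array([0.0, 0.0, np.pi / 2])                 # facing +y
delta, sigma = fold_out_of_plane([1.0, 0.0, 0.02, 0.0, 0.01, 0.0])
new_pose = se2_compose(pose, delta)                    # moves 1 m along +y
```

Driving forward while heading +y moves the pose along the y axis, and the small vertical/pitch perturbation raises the noise level instead of being dropped silently.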
|
| |
| 10:00-11:30, Paper MoAIP-15.13 | Add to My Program |
| ESVIO: Event-Based Stereo Visual Inertial Odometry |
|
| Chen, Peiyu | The University of Hong Kong |
| Guan, Weipeng | The University of Hong Kong |
| Lu, Peng | The University of Hong Kong |
Keywords: Visual-Inertial SLAM, Sensor Fusion, Aerial Systems: Perception and Autonomy
Abstract: Event cameras that asynchronously output low-latency event streams provide great opportunities for state estimation under challenging situations. Although event-based visual odometry has been extensively studied in recent years, most approaches are monocular, and stereo event vision remains little explored. In this paper, we present ESVIO, the first event-based stereo visual-inertial odometry, which leverages the complementary advantages of event streams, standard images, and inertial measurements. Our proposed pipeline achieves spatial and temporal associations between consecutive stereo event streams, thereby obtaining robust state estimation. In addition, a motion compensation method is designed to emphasize the edges of scenes by warping each event to reference moments using the IMU and the ESVIO back-end. We validate that both ESIO (purely event-based) and ESVIO (event-based with image aid) have superior performance compared with other image-based and event-based baseline methods on public and self-collected datasets. Furthermore, we use our pipeline to perform onboard quadrotor flights in low-light environments. A real-world large-scale experiment is also conducted to demonstrate long-term effectiveness. We highlight that this work is a real-time, accurate system aimed at robust state estimation in challenging environments.
|
| |
| MoAIP-16 Regular session, Hall E |
Add to My Program |
| Clone of 'Autonomous Agents' |
|
| |
| |
| 10:00-11:30, Paper MoAIP-16.1 | Add to My Program |
| Catch Me If You Hear Me: Audio-Visual Navigation in Complex Unmapped Environments with Moving Sounds |
|
| Younes, Abdelrahman | KIT |
| Honerkamp, Daniel | Albert-Ludwigs-Universität Freiburg |
| Welschehold, Tim | Albert-Ludwigs-Universität Freiburg |
| Valada, Abhinav | University of Freiburg |
Keywords: Autonomous Agents, Reactive and Sensor-Based Planning, Reinforcement Learning
Abstract: Audio-visual navigation combines sight and hearing to navigate to a sound-emitting source in an unmapped environment. While recent approaches have demonstrated the benefits of audio input to detect and find the goal, they focus on clean and static sound sources and struggle to generalize to unheard sounds. In this work, we propose the novel dynamic audio-visual navigation benchmark which requires catching a moving sound source in an environment with noisy and distracting sounds, posing a range of new challenges. We introduce a reinforcement learning approach that learns a robust navigation policy for these complex settings. To achieve this, we propose an architecture that fuses audio-visual information in the spatial feature space to learn correlations of geometric information inherent in both local maps and audio signals. We demonstrate that our approach consistently outperforms the current state-of-the-art by a large margin across all tasks of moving sounds, unheard sounds, and noisy environments, on two challenging 3D scanned real-world environments, namely Matterport3D and Replica. The benchmark is available at http://dav-nav.cs.uni-freiburg.de.
|
| |
| 10:00-11:30, Paper MoAIP-16.2 | Add to My Program |
| Joint Imitation Learning of Behavior Decision and Control for Autonomous Intersection Navigation |
|
| Zhu, Zeyu | Key Laboratory of Machine Perception, Peking University |
| Zhao, Huijing | Peking University |
Keywords: Autonomous Agents, Control Architectures and Programming
Abstract: Modern autonomous driving systems face substantial challenges when navigating dense intersections due to the high uncertainty introduced by other road users. Due to the complexity of the task, the autonomous vehicle needs to generate policies at multiple levels of abstraction. However, previous deep imitation learning methods focused on learning control policies while using simple rule-based behavior models. To bridge this gap and achieve human-like driving, we develop a hierarchy of high-level behavior decision and low-level control, where both policies are jointly learned from human demonstrations based on imitation learning. Over 60 hours of driving data from 10 drivers at six intersections was collected. The proposed method is extensively evaluated in challenging intersection scenarios. Empirical results demonstrate the method's superior performance over baselines in terms of task completion and control quality. We demonstrate the importance of learning human-like behavior decisions as well as joint learning of behavior and control policies. The capability of imitating different driving styles is also illustrated.
|
| |
| 10:00-11:30, Paper MoAIP-16.3 | Add to My Program |
| Improving the Performance of Backward Chained Behavior Trees That Use Reinforcement Learning |
|
| Kartašev, Mart | KTH Royal Institute of Technology |
| Salér, Justin | KTH |
| Ogren, Petter | Royal Institute of Technology (KTH) |
Keywords: Behavior-Based Systems, Autonomous Agents, Control Architectures and Programming
Abstract: In this paper we show how to improve the performance of backward chained behavior trees (BTs) that include policies trained with reinforcement learning (RL). BTs represent a hierarchical and modular way of combining control policies into higher level control policies. Backward chaining is a design principle for the construction of BTs that combines reactivity with goal directed actions in a structured way. The backward chained structure has also enabled convergence proofs for BTs, identifying a set of local conditions to be satisfied for the convergence of all trajectories to a set of desired goal states. The key idea of this paper is to improve performance of backward chained BTs by using the conditions identified in a theoretical convergence proof to configure the RL problems for individual controllers. Specifically, previous analysis identified so-called active constraint conditions (ACCs), that should not be violated in order to avoid having to return to work on previously achieved subgoals. We propose a way to set up the RL problems, such that they do not only achieve each immediate subgoal, but also avoid violating the identified ACCs. The resulting performance improvement depends on how often ACC violations occurred before the change, and how much effort, in terms of execution time, was needed to re-achieve them. The proposed approach is illustrated in a dynamic simulation environment.
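The ACC-aware reward shaping described above can be sketched as follows: each subgoal controller's RL reward is penalized for violating an active constraint condition (i.e., for undoing a previously achieved subgoal), not just rewarded for its own subgoal. The reward structure and constants are illustrative, not the paper's:

```python
def acc_aware_reward(reached_subgoal, acc_violated,
                     step_cost=0.01, violation_penalty=1.0):
    # Shaped reward for one controller in a backward-chained BT:
    # small per-step cost, bonus for the immediate subgoal, and a
    # penalty whenever an active constraint condition is violated.
    r = -step_cost
    if reached_subgoal:
        r += 1.0
    if acc_violated:
        r -= violation_penalty
    return r

# A controller that reaches its subgoal while breaking an ACC:
naive = acc_aware_reward(True, True, violation_penalty=0.0)   # ACCs ignored
shaped = acc_aware_reward(True, True)                          # ACCs penalized
```

Under the shaped reward, reaching the subgoal at the cost of an ACC violation no longer pays off, which is the mechanism that avoids expensive returns to already-achieved subgoals.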
|
| |
| 10:00-11:30, Paper MoAIP-16.4 | Add to My Program |
| Fast Decision Support for Air Traffic Management at Urban Air Mobility Vertiports Using Graph Learning |
|
| KrisshnaKumar, Prajit | University at Buffalo |
| Witter, Jhoel | University at Buffalo |
| Paul, Steve | University at Buffalo |
| Cho, Hanvit | State University of New York at Buffalo |
| Dantu, Karthik | University of Buffalo |
| Chowdhury, Souma | University at Buffalo, State University of New York |
Keywords: Intelligent Transportation Systems, Multi-Robot Systems, Reinforcement Learning
Abstract: Urban Air Mobility (UAM) promises a new dimension to decongested, safe, and fast travel in urban and suburban hubs. These UAM aircraft are conceived to operate from small airports called vertiports, each comprising multiple take-off/landing and battery-recharging spots. Since they might be situated in dense urban areas and need to handle many aircraft landings and take-offs each hour, managing this schedule in real time becomes challenging for a traditional air-traffic controller and instead calls for an automated solution. This paper provides a novel approach to this problem of Urban Air Mobility - Vertiport Schedule Management (UAM-VSM), which leverages graph reinforcement learning to generate decision-support policies. Here the designated physical spots within the vertiport's airspace and the vehicles being managed are represented as two separate graphs, with feature extraction performed through a graph convolutional network (GCN). Extracted features are passed onto perceptron layers to decide actions such as continue to hover or cruise, continue idling or take off, or land on an allocated vertiport spot. Performance is measured based on delays, safety (number of collisions), and battery consumption. Through realistic simulations in AirSim applied to scaled-down multi-rotor vehicles, our results demonstrate the suitability of using graph reinforcement learning to solve the UAM-VSM problem and its superiority over basic reinforcement learning (with graph embeddings) or random choice baselines.
|
| |
| 10:00-11:30, Paper MoAIP-16.5 | Add to My Program |
| Scaling Vision-Based End-To-End Autonomous Driving with Multi-View Attention Learning |
|
| Xiao, Yi | Computer Vision Center, Universitat Autònoma De Barcelona |
| Codevilla, Felipe | Mila/ Independent Robotics |
| Porres, Diego | Computer Vision Center, Universitat Autònoma De Barcelona |
| Lopez, Antonio M. | Computer Vision Center, Universitat Autònoma De Barcelona |
Keywords: Autonomous Agents, Imitation Learning, Intelligent Transportation Systems
Abstract: In end-to-end driving, human driving demonstrations are used to train perception-based driving models by imitation learning. This process is supervised on vehicle signals (e.g., steering angle, acceleration) but does not require extra costly supervision (human labeling of sensor data). As a representative of such vision-based end-to-end driving models, CILRS is commonly used as a baseline to compare with new driving models. So far, some of the latest models achieve better performance than CILRS by using expensive sensor suites and/or by using large amounts of human-labeled data for training. Given the difference in performance, one may think that it is not worth pursuing vision-based pure end-to-end driving. However, we argue that this approach still has great value and potential considering cost and maintenance. In this paper, we present CIL++, which improves on CILRS by both processing higher-resolution images using a human-inspired horizontal field of view (HFOV) as an inductive bias and incorporating a proper attention mechanism. CIL++ achieves competitive performance compared to models which are more costly to develop. We propose to replace CILRS with CIL++ as a strong vision-based pure end-to-end driving baseline supervised by only vehicle signals and trained by conditional imitation learning.
|
| |
| 10:00-11:30, Paper MoAIP-16.6 | Add to My Program |
| Value of Assistance for Mobile Agents |
|
| Amuzig, Adi | Technion - Israel Institute of Technology |
| Dovrat, David | Technion |
| Keren, Sarah | Technion - Israel Institute of Technology |
Keywords: Autonomous Agents, Probability and Statistical Methods, Localization
Abstract: Mobile robotic agents often suffer from localization uncertainty which grows with time and with the agents' movement. This can hinder their ability to accomplish their task. In some settings, it may be possible to perform assistive actions that reduce uncertainty about a robot's location. For example, in a collaborative multi-robot system, a wheeled robot can request assistance from a drone that can fly to its estimated location and reveal its exact location on the map or accompany it to its intended location. Since assistance may be costly and limited, and may be requested by different members of a team, there is a need for principled ways to support the decision of which assistance to provide to an agent and when, as well as to decide which agent to help within a team. For this purpose, we propose Value of Assistance (VOA) to represent the expected cost reduction that assistance will yield at a given point of execution. We offer ways to compute VOA based on estimations of the robot's future uncertainty, modeled as a Gaussian process. We specify conditions under which our VOA measures are valid and empirically demonstrate the ability of our measures to predict the agent's average cost reduction when receiving assistance in both simulated and real-world robotic settings.
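A VOA-style computation can be sketched as follows, assuming a simple quadratic uncertainty forecast as an illustrative stand-in for the paper's Gaussian-process model, and a cost proportional to uncertainty:

```python
def value_of_assistance(horizon, assist_at, k_cost=2.0):
    # Expected cost reduction from one assistive action (e.g. a drone
    # revealing the robot's true location) at step `assist_at`.
    # Uncertainty grows quadratically with time in this toy forecast
    # and is reset to zero by assistance, after which it grows again.
    base = [0.01 * t ** 2 for t in range(horizon)]
    assisted = [0.01 * t ** 2 if t < assist_at
                else 0.01 * (t - assist_at) ** 2
                for t in range(horizon)]
    # Cost is modeled as proportional to localization uncertainty.
    return k_cost * (sum(base) - sum(assisted))

voa_mid = value_of_assistance(10, assist_at=3)
voa_late = value_of_assistance(10, assist_at=8)
```

Comparing VOA across candidate assistance times (or across agents) is what lets a team decide whom to help and when; here, helping at step 3 is worth more than waiting until step 8.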
|
| |
| 10:00-11:30, Paper MoAIP-16.7 | Add to My Program |
| Feature Explanation for Robust Trajectory Prediction |
|
| Zhai, Xukai | Wuhan University of Technology |
| Hu, Renze | Wuhan University of Technology |
| Yin, Zhishuai | Wuhan University of Technology |
Keywords: Autonomous Agents, AI-Based Methods, Deep Learning Methods
Abstract: Trajectory prediction of neighboring agents is a critical task for high-speed robotics such as autonomous vehicles. In order to obtain fine-grained and robust scene representations, existing works attempt to consider abundant information that is deemed relevant. The cost, however, is the heavy computational burden and more importantly the inevitable interference brought by redundant information. In this paper, we exploit the explainable AI (XAI) techniques and propose a model in the framework of "Encoder-Decoder" named parallel explainable Transformer (PXT) to identify the contributive features for robust trajectory prediction. A two-branch encoder is designed to disentangle the roadway information and agents' historical trajectories for better feature explanation. Two stages of feature explanation are incorporated into the encoder. In the first stage, an explainable Transformer (XT) comprising a Layer-wise Relevance Propagation (LRP)-based interpretation module is designed and implemented in both branches to score and filter the contextual and motion features. In the second stage of interpretation, the ProbSparse attention mechanism is innovatively adopted to measure the level of interactivity with sparsity, so that the relationships among highly interactive agents are focused on. The results on the Argoverse Benchmark show that our proposal achieves state-of-the-art (SOTA) performance without delicate and tedious network design, demonstrating the effectiveness of tracing and retaining contributive features in enhancing the performance of trajectory prediction.
|
| |
| 10:00-11:30, Paper MoAIP-16.8 | Add to My Program |
| Adversarial Driving Behavior Generation Incorporating Human Risk Cognition for Autonomous Vehicle Evaluation |
|
| Liu, Zhen | Jilin University |
| Gao, Hang | Jilin University |
| Ma, Hao | Jilin University |
| Cai, Shuo | Jilin University |
| Hu, Yunfeng | Jilin University |
| Qu, Ting | Jilin University |
| Chen, Hong | Tongji University |
| Gong, Xun | Jilin University |
Keywords: Autonomous Agents, Cognitive Modeling, Reinforcement Learning
Abstract: Autonomous vehicle (AV) evaluation has been the subject of increased interest in recent years both in industry and in academia. This paper focuses on the development of a novel framework for generating adversarial driving behavior of background vehicle interfering against the AV to expose effective and rational risky events. Specifically, the adversarial behavior is learned by a reinforcement learning (RL) approach incorporated with the cumulative prospect theory (CPT) which allows representation of human risk cognition. Then, the extended version of deep deterministic policy gradient (DDPG) technique is proposed for training the adversarial policy while ensuring training stability as the CPT action-value function is leveraged. A comparative case study regarding the cut-in scenario is conducted on a high fidelity Hardware-in-the-Loop (HiL) platform and the results demonstrate the adversarial effectiveness to infer the weakness of the tested AV.
|
| |
| 10:00-11:30, Paper MoAIP-16.9 | Add to My Program |
| Predicting Center of Mass by Iterative Pushing for Object Transportation and Manipulation |
|
| Hyland, Steven Michael | Worcester Polytechnic Institute |
| Xiao, Jing | Worcester Polytechnic Institute (WPI) |
| Onal, Cagdas | WPI |
Keywords: Autonomous Agents, Wheeled Robots, Manipulation Planning
Abstract: Robotic manipulation tasks rely on a plethora of environmental and payload information. One critical piece of information for accurate manipulation is the center of mass (CoM) of the object, which is essential for estimating the dynamic response of the system and determining the payload placement. Traditionally, the CoM of a payload is provided prior to manipulation. In order to create a more robust and comprehensive system, this information should be collected by the robotic agent before or during the task run time. This paper presents a method for approximating the CoM of a planar object using a small-scale mobile robot to inform manipulation tasks. On average, our system is able to converge on a CoM estimate in under 30 seconds in simulation and 20 seconds in experiment, with a relative error of 4.95% and 5.46%, respectively.
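The iterative-pushing idea can be sketched as a bisection over push lines under a toy rotation model: pushing along a line offset from the CoM rotates the object, and the rotation direction tells the robot which side the CoM is on. The physics and convergence behavior here are illustrative, not the paper's:

```python
def rotation_from_push(push_x, com_x, gain=1.0):
    # Toy physics: rotation is proportional to the signed offset
    # between the push line and the (unknown) CoM.
    return gain * (com_x - push_x)

def estimate_com(true_com_x, lo=0.0, hi=1.0, tol=1e-3):
    # Bisect on the push location: each push's rotation direction
    # halves the interval known to contain the CoM.
    pushes = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rotation_from_push(mid, true_com_x) > 0:
            lo = mid          # object rotated one way: CoM is to the right
        else:
            hi = mid          # rotated the other way (or not): CoM is left
        pushes += 1
    return 0.5 * (lo + hi), pushes

est, n_pushes = estimate_com(0.37)   # hypothetical object with CoM at x = 0.37
```

Each push halves the candidate interval, so millimeter-level estimates need only on the order of ten pushes, consistent with the abstract's tens-of-seconds convergence.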
|
| |
| 10:00-11:30, Paper MoAIP-16.10 | Add to My Program |
| The Impact of Overall Optimization on Warehouse Automation |
|
| Yoshitake, Hiroshi | Hitachi America Ltd |
| Abbeel, Pieter | UC Berkeley |
Keywords: Discrete Event Dynamic Automation Systems, Reinforcement Learning, Multi-Robot Systems
Abstract: In this study, we propose a novel approach for investigating optimization performance by flexible robot coordination in automated warehouses with multi-agent reinforcement learning (MARL)-based control. Automated systems using robots are expected to achieve efficient operations compared with manual systems in terms of overall optimization performance. However, the impact of overall optimization on performance remains unclear in most automated systems due to a lack of suitable control methods. Thus, we propose a centralized-training, decentralized-execution MARL framework as a practical overall optimization control method. Within this framework, we also propose a single shared critic, trained with global states and rewards, applicable to cases in which heterogeneous agents make decisions asynchronously. Our proposed MARL framework was applied to the task selection of material handling equipment in an automated order-picking simulation, and its performance was evaluated to determine how far overall optimization outperforms partial optimization by comparing it with other MARL frameworks and rule-based control methods.
|
| |
| 10:00-11:30, Paper MoAIP-16.11 | Add to My Program |
| Kinematics-Only Differential Flatness Based Trajectory Tracking for Autonomous Racing |
|
| Dighe, Yashom | University at Buffalo, State University of New York |
| Kim, Youngjin | University at Buffalo |
| Rajguru, Smit | State University of New York at Buffalo |
| Turkar, Yash | University at Buffalo |
| Singh, Tarunraj | University at Buffalo |
| Dantu, Karthik | University of Buffalo |
Keywords: Autonomous Agents, Wheeled Robots, Kinematics
Abstract: In autonomous racing, accurately tracking the race line at the limits of handling is essential to guarantee competitiveness. In this study, we show the effectiveness of Differential Flatness based control for high-speed trajectory tracking for car-like robots. We compare the tracking performance and resource use of our controller against Nonlinear Model Predictive Control (NMPC) while running on embedded hardware, and show that on average our controller (KFC) reduces computation resource usage by 50% while performing on par with NMPC. Our implementation of the proposed controller, the simulation environment, and detailed results are open-sourced at https://github.com/droneslab/
|
| |
| 10:00-11:30, Paper MoAIP-16.12 | Add to My Program |
| LEF: Late-To-Early Temporal Fusion for LiDAR 3D Object Detection |
|
| He, Tong | Waymo LLC |
| Sun, Pei | Waymo |
| Leng, Zhaoqi | Waymo LLC |
| Liu, Chenxi | Waymo |
| Anguelov, Dragomir | Waymo |
| Tan, Mingxing | Waymo Research |
Keywords: Autonomous Agents, Object Detection, Segmentation and Categorization, Semantic Scene Understanding
Abstract: We propose a late-to-early recurrent feature fusion scheme for 3D object detection using temporal LiDAR point clouds. Our main motivation is fusing object-aware latent embeddings into the early stages of a 3D object detector. This feature fusion strategy enables the model to better capture the shapes and poses for challenging objects, compared with learning from raw points directly. Our method conducts late-to-early feature fusion in a recurrent manner. This is achieved by enforcing window-based attention blocks upon temporally calibrated and aligned sparse pillar tokens. Leveraging bird's eye view foreground pillar segmentation, we reduce the number of sparse history features that our model needs to fuse into its current frame by 10x. We also propose a stochastic-length FrameDrop training technique, which generalizes the model to variable frame lengths at inference for improved performance without retraining. We evaluate our method on the widely adopted Waymo Open Dataset and demonstrate improvement on 3D object detection against the baseline model, especially for the challenging category of large objects.
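The stochastic-length FrameDrop idea, training on a randomly chosen number of history frames so the detector handles variable frame lengths at inference without retraining, can be sketched as follows. The sampling scheme is an assumption for illustration, not Waymo's implementation:

```python
import random

def framedrop_history(history, max_keep, rng):
    # Keep a random number of the most recent past frames; the model
    # therefore sees many different history lengths during training.
    keep = rng.randint(1, min(max_keep, len(history)))
    return history[-keep:]

rng = random.Random(0)                # fixed seed for reproducibility
hist = list(range(10))                # ten past frames, oldest first
sampled_lengths = [len(framedrop_history(hist, 10, rng)) for _ in range(100)]
```

Across training batches the kept-history length varies, which is the property that lets one trained model run with whatever history length is available at inference.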
|
| |
| 10:00-11:30, Paper MoAIP-16.13 | Add to My Program |
| Learning Behavior Trees from Planning Experts Using Decision Tree and Logic Factorization |
|
| Gugliermo, Simona | Örebro University, Scania |
| Schaffernicht, Erik | Örebro University, AASS Research Center |
| Koniaris, Christos | Scania |
| Pecora, Federico | Amazon Robotics |
Keywords: Behavior-Based Systems, Learning from Demonstration, Intelligent Transportation Systems
Abstract: The increased popularity of Behavior Trees (BTs) in different fields of robotics requires efficient methods for learning BTs from data instead of tediously handcrafting them. Recent research in learning from demonstration reported encouraging results that this paper extends, improves, and generalizes to arbitrary planning domains. We propose BT-Factor as a new method for learning expert knowledge by representing it in a BT. Execution traces of previously manually designed plans are used to generate a BT employing a combination of decision tree learning and logic factorization techniques originating from circuit design. We test BT-Factor in an industrially relevant simulation environment from a mining scenario and compare it against a state-of-the-art BT learning method. The results show that our method generates compact BTs that are easy to interpret and capable of accurately capturing the relations implicit in the training data.
|
| |
| MoAIP-17 Regular session, Hall E |
Add to My Program |
| Clone of 'Imitation Learning' |
|
| |
| |
| 10:00-11:30, Paper MoAIP-17.1 | Add to My Program |
| Learning from Guided Play: Improving Exploration for Adversarial Imitation Learning with Simple Auxiliary Tasks |
|
| Ablett, Trevor | University of Toronto |
| Chan, Bryan | University of Alberta |
| Kelly, Jonathan | University of Toronto |
Keywords: Imitation Learning, Reinforcement Learning, Transfer Learning
Abstract: Adversarial imitation learning (AIL) has become a popular alternative to supervised imitation learning that reduces the distribution shift suffered by the latter. However, AIL requires effective exploration during an online reinforcement learning phase. In this work, we show that the standard, naïve approach to exploration can manifest as a suboptimal local maximum if a policy learned with AIL sufficiently matches the expert distribution without fully learning the desired task. This can be particularly catastrophic for manipulation tasks, where the difference between an expert and a non-expert state-action pair is often subtle. We present Learning from Guided Play (LfGP), a framework in which we leverage expert demonstrations of multiple exploratory auxiliary tasks in addition to a main task. The addition of these auxiliary tasks forces the agent to explore states and actions that standard AIL may learn to ignore. Additionally, this particular formulation allows the reusability of expert data between main tasks. Our experimental results in a challenging multitask robotic manipulation domain indicate that LfGP significantly outperforms both AIL and behavior cloning (BC), while also being more expert-sample efficient than these baselines. To explain this performance gap, we provide further analysis of a toy problem that highlights the coupling between a local maximum and poor exploration, and also visualize the differences between the learned models from AIL and LfGP.
|
| |
| 10:00-11:30, Paper MoAIP-17.2 | Add to My Program |
| Hierarchical Decision Transformer |
|
| Correia, André | Universidade Da Beira Interior and NOVA LINCS |
| Alexandre, Luís A. | Univ. Beira Interior and NOVA LINCS |
Keywords: Imitation Learning, Deep Learning Methods, Machine Learning for Robot Control
Abstract: Sequence models in reinforcement learning require task knowledge to estimate the task policy. This paper presents the hierarchical decision transformer (HDT). HDT is a hierarchical behavior cloning algorithm that improves the performance of transformer methods in imitation learning, improving their robustness to tasks with longer episodes and/or sparse rewards, without requiring the task knowledge or user interaction present in the current state-of-the-art. The high-level mechanism guides the low-level controller through the task by selecting sub-goals for the latter to reach. This sequence replaces the returns-to-go of previous methods, improving performance overall, especially in tasks with longer episodes and scarcer rewards. We validate our method in multiple tasks of the OpenAI Gym, D4RL, and RoboMimic benchmarks. Our method outperforms the baselines in twenty-three out of thirty-one settings of varied horizons and reward frequencies without prior task knowledge, showing the advantages of the hierarchical model approach for learning from demonstrations using a sequence model. We also evaluate the method on a reaching task on a physical robot.
|
| |
| 10:00-11:30, Paper MoAIP-17.3 | Add to My Program |
| ProDMPs: A Unified Perspective on Dynamic and Probabilistic Movement Primitives |
|
| Li, Ge | Karlsruhe Institute of Technology (KIT) |
| Jin, Zeqi | Karlsruhe Institute of Technology |
| Volpp, Michael | Karlsruhe Institute of Technology |
| Otto, Fabian | Bosch Center for AI, University of Tuebingen |
| Lioutikov, Rudolf | Karlsruhe Institute of Technology |
| Neumann, Gerhard | Karlsruhe Institute of Technology |
Keywords: Imitation Learning, Machine Learning for Robot Control
Abstract: Movement Primitives (MPs) are a well-known concept to represent and generate modular trajectories. MPs can be broadly categorized into two types: (a) dynamics-based approaches that generate smooth trajectories from any initial state, e.g., Dynamic Movement Primitives (DMPs), and (b) probabilistic approaches that capture higher-order statistics of the motion, e.g., Probabilistic Movement Primitives (ProMPs). To date, however, there is no MP method that unifies both, i.e., that can generate smooth trajectories from an arbitrary initial state while capturing higher-order statistics. In this paper, we introduce a unified perspective of both approaches by solving the ODE underlying the DMPs. We convert the expensive online numerical integration of DMPs into position and velocity basis functions that can be used to represent trajectories or trajectory distributions similar to ProMPs while maintaining all the properties of dynamical systems. Since we inherit the properties of both methodologies, we call our proposed model Probabilistic Dynamic Movement Primitives (ProDMPs). Additionally, we embed ProDMPs in a deep neural network architecture and propose a new cost function for efficient end-to-end learning of higher-order trajectory statistics. To this end, we leverage Bayesian Aggregation for non-linear iterative conditioning on sensory inputs. Our proposed model achieves smooth trajectory generation, goal-attractor convergence, correlation analysis, non-linear conditioning, and online re-planning in one framework. Our code can be found at https://github.com/BruceGeLi/ProDMP_RAL.
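For context, the second-order goal-attractor ODE that DMPs are built on (and whose numerical integration ProDMPs replace with basis functions) can be integrated in a few lines; the gains, goal, and step size below are illustrative values, and the forcing term is set to zero:

```python
import numpy as np

# Minimal sketch of the goal-attractor dynamics underlying a DMP. The paper's
# contribution, solving this ODE in closed form so trajectories become linear
# in basis-function weights, is not reproduced here.
tau, alpha, beta = 1.0, 25.0, 25.0 / 4.0   # critically damped gain choice
g = 1.0                                     # goal attractor
y, dy = 0.0, 0.0                            # initial position / velocity
dt, T = 0.001, 1.0

traj = []
for _ in range(int(T / dt)):
    # ddy = alpha * (beta * (g - y) - tau * dy) / tau**2, with forcing term
    # f(x) = 0, so the system simply converges to the goal.
    ddy = alpha * (beta * (g - y) - tau * dy) / tau**2
    dy += ddy * dt
    y += dy * dt
    traj.append(y)

print(round(traj[-1], 3))  # converges toward the goal g = 1.0
```

Every step of this Euler loop depends on the previous one, which is exactly the online cost the basis-function reformulation avoids.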
|
| |
| 10:00-11:30, Paper MoAIP-17.4 | Add to My Program |
| Imitation-Guided Multimodal Policy Generation from Behaviourally Diverse Demonstrations |
|
| Zhu, Shibei | Aalto University |
| Kaushik, Rituraj | Aalto University, Finland |
| Kaski, Samuel | Aalto University, University of Manchester |
| Kyrki, Ville | Aalto University |
Keywords: Evolutionary Robotics, Imitation Learning, Reinforcement Learning
Abstract: Learning policies from multiple demonstrators is often difficult because different individuals perform the same task differently due to hidden factors such as preferences. In the context of policy learning, this leads to multimodal policies. Existing policy learning methods often converge to a single solution mode, failing to capture the diversity in the solution space. In this paper, we introduce an imitation-guided reinforcement learning framework to solve the multimodal policy learning problem from a limited number of state-only demonstrations. Then, we propose LfBD (Learning from Behaviourally diverse Demonstration), an algorithm that builds a parameterised solution space to capture the variability in the behaviour space defined by demonstrations. To this end, we define a projection function based on the state density distributions from demonstrations to define such space. Our goal is not only to learn how to solve the task as the human demonstrator but also to extrapolate beyond the provided demonstrations. In addition, we show that with our method, we can perform a post-hoc policy search in the built solution space to recover policies that satisfy specific constraints or to find a policy that matches a given (state-only) behaviour.
|
| |
| 10:00-11:30, Paper MoAIP-17.5 | Add to My Program |
| Model-Based Adversarial Imitation Learning from Demonstrations and Human Reward |
|
| Huang, Jie | Ocean University of China |
| Hao, Jiangshan | Ocean University of China |
| Juan, Rongshun | Tianjin University |
| Gomez, Randy | Honda Research Institute Japan Co., Ltd |
| Nakamura, Keisuke | Honda Research Institute Japan Co., Ltd |
| Li, Guangliang | Ocean University of China |
Keywords: Imitation Learning, Human-Robot Collaboration, Reinforcement Learning
Abstract: Reinforcement learning (RL) can potentially be applied to real-world robot control in complex and uncertain environments. However, it is difficult or even impractical to design an efficient reward function for various tasks, especially in large and high-dimensional environments. Generative adversarial imitation learning (GAIL), a general model-free imitation learning method, allows robots to directly learn policies from expert trajectories in large and high-dimensional environments. However, GAIL is still sample inefficient in terms of environmental interaction. In this paper, to solve this problem, we propose model-based adversarial imitation learning from demonstrations and human reward (MAILDH), a novel model-based interactive imitation framework combining the advantages of GAIL, interactive RL, and model-based RL. We tested our method in eight physics-based discrete and continuous control tasks for RL. Our results show that MAILDH can greatly improve sample efficiency and robustness compared to the original GAIL.
|
| |
| 10:00-11:30, Paper MoAIP-17.6 | Add to My Program |
| Interpretable Motion Planner for Urban Driving Via Hierarchical Imitation Learning |
|
| Wang, Bikun | Horizon Robotics |
| Wang, Zhipeng | Horizon Robotics |
| Zhu, Chenhao | Horizon Robotics |
| Zhang, Zhiqiang | Horizon Robotics |
| Wang, Zhichen | Horizon Robotics |
| Lin, Penghong | Horizon Robotics |
| Liu, Jingchu | Horizon Robotics |
| Zhang, Qian | Horizon Robotics |
Keywords: Imitation Learning, Computer Vision for Automation, Task and Motion Planning
Abstract: Learning-based approaches have achieved remarkable performance in the domain of autonomous driving. Leveraging the impressive ability of neural networks and large amounts of human driving data, complex patterns and rules of driving behavior can be encoded as a model to benefit the autonomous driving system. In addition, an increasing number of data-driven works have studied the decision-making and motion planning modules. However, the reliability and stability of neural networks remain uncertain. In this paper, we introduce a hierarchical planning architecture, including a high-level grid-based behavior planner and a low-level trajectory planner, which is highly interpretable and controllable. While the high-level planner is responsible for finding a consistent route, the low-level planner generates a feasible trajectory. We evaluate our method both in closed-loop simulation and in real-world driving, and demonstrate that the neural network planner has outstanding performance in complex urban autonomous driving scenarios.
|
| |
| 10:00-11:30, Paper MoAIP-17.7 | Add to My Program |
| Hierarchical Imitation Learning for Stochastic Environments |
|
| Igl, Maximilian | Waymo LLC |
| Shah, Punit | Waymo |
| Mougin, Paul | Waymo |
| Srinivasan, Sirish | ETH Zürich |
| Gupta, Tarun | University of Oxford |
| White, Brandyn | Waymo |
| Shiarlis, Kyriacos | Waymo |
| Whiteson, Shimon | Waymo |
Keywords: Imitation Learning, Representation Learning, Deep Learning Methods
Abstract: Many applications of imitation learning require the agent to generate the full distribution of observed behaviour in the training data. For example, to evaluate the safety of autonomous vehicles in simulation, accurate and diverse behaviour models of other road users are paramount. Existing methods that improve this distributional realism typically rely on hierarchical policies. These condition the policy on types such as goals or personas that give rise to the multi-modal behaviour. However, such methods are often inappropriate for stochastic environments where the agent must also react to external factors. Because agent types are inferred from the observed future trajectory during training, these environments require that the contributions of internal and external factors to the agent behaviour are disentangled and only internal factors that are under the agent's control are encoded in the type. Encoding future information about external factors leads to inappropriate agent reactions during testing, when the future is unknown and types must be drawn randomly.
|
| |
| 10:00-11:30, Paper MoAIP-17.8 | Add to My Program |
| Efficient Deep Learning of Robust, Adaptive Policies Using Tube MPC-Guided Data Augmentation |
|
| Zhao, Tong | Massachusetts Institute of Technology |
| Tagliabue, Andrea | Massachusetts Institute of Technology |
| How, Jonathan | Massachusetts Institute of Technology |
Keywords: Imitation Learning, Machine Learning for Robot Control, Robust/Adaptive Control
Abstract: The deployment of agile autonomous systems in challenging, unstructured environments requires adaptation capabilities and robustness to uncertainties. Existing robust and adaptive controllers, such as those based on model predictive control (MPC), can achieve impressive performance at the cost of heavy online onboard computations. Strategies that efficiently learn robust and onboard-deployable policies from MPC have emerged, but they still lack fundamental adaptation capabilities. In this work, we extend an existing efficient Imitation Learning (IL) algorithm for robust policy learning from MPC with the ability to learn policies that adapt to challenging model/environment uncertainties. The key idea of our approach consists in modifying the IL procedure by conditioning the policy on a learned lower-dimensional model/environment representation that can be efficiently estimated online. We tailor our approach to the task of learning an adaptive position and attitude control policy to track trajectories under challenging disturbances on a multirotor. Evaluations in simulation show that a high-quality adaptive policy can be obtained in about 1.3 hours. We additionally empirically demonstrate rapid adaptation to in- and out-of-training-distribution uncertainties, achieving a 6.1 cm average position error under wind disturbances that correspond to about 50% of the weight of the robot, and that are 36% larger than the maximum wind seen during training.
|
| |
| 10:00-11:30, Paper MoAIP-17.9 | Add to My Program |
| Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations |
|
| Hao, Yilun | Stanford University |
| Wang, Ruinan | Stanford University |
| Cao, Zhangjie | Stanford University |
| Wang, Zihan | Stanford University |
| Cui, Yuchen | Stanford University |
| Sadigh, Dorsa | Stanford University |
Keywords: Imitation Learning, Learning from Demonstration
Abstract: Multimodal demonstrations provide robots with an abundance of information to make sense of the world. However, such abundance may not always lead to good performance when it comes to learning sensorimotor control policies from human demonstrations. Extraneous data modalities can lead to state over-specification, where the state contains modalities that are not only useless for decision-making but also can change data distribution across environments. State over-specification leads to issues such as the learned policy not generalizing outside of the training data distribution. In this work, we propose Masked Imitation Learning (MIL) to address state over-specification by selectively using informative modalities. Specifically, we design a masked policy network with a binary mask to block certain modalities. We develop a bi-level optimization algorithm that learns this mask to accurately filter over-specified modalities. We demonstrate empirically that MIL outperforms baseline algorithms in simulated domains and effectively recovers the environment-invariant modalities on a multimodal dataset collected on a real robot. Videos and supplemental details are at: https://tinyurl.com/masked-il
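The modality-masking idea above can be sketched in a few lines; the modality names, feature widths, and the fixed mask are illustrative assumptions (MIL learns its mask through bi-level optimization rather than fixing it by hand):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical multimodal observation: three modalities of different widths.
obs = {"proprio":  rng.normal(size=7),
       "rgb_feat": rng.normal(size=16),
       "audio":    rng.normal(size=4)}

# Binary mask over modalities, as in the abstract; here "audio" is treated as
# the over-specified modality and blocked.
mask = {"proprio": 1, "rgb_feat": 1, "audio": 0}

def masked_input(obs, mask):
    """Zero out masked modalities before feeding the policy network."""
    return np.concatenate([obs[k] * mask[k] for k in sorted(obs)])

x = masked_input(obs, mask)
print(x.shape)  # (27,)
assert np.all(x[:4] == 0.0)  # "audio" sorts first and is masked out
```

Zeroing (rather than dropping) the masked modality keeps the input width fixed, so the same policy network can be trained while the mask itself is optimized.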
|
| |
| 10:00-11:30, Paper MoAIP-17.10 | Add to My Program |
| Does Unpredictability Influence Driving Behavior? |
|
| Samavi, Sepehr | University of Toronto |
| Shkurti, Florian | University of Toronto |
| Schoellig, Angela P. | TU Munich |
Keywords: Human-Aware Motion Planning, Imitation Learning
Abstract: In this paper we investigate the effect of the unpredictability of surrounding cars on an ego-car performing a driving maneuver. We use Maximum Entropy Inverse Reinforcement Learning to model reward functions for an ego-car conducting a lane change in a highway setting. We define a new feature based on the unpredictability of surrounding cars and use it in the reward function. We learn two reward functions from human data: a baseline and one that incorporates our defined unpredictability feature, then compare their performance with a quantitative and qualitative evaluation. Our evaluation demonstrates that incorporating the unpredictability feature leads to a better fit of human-generated test data. These results encourage further investigation of the effect of unpredictability on driving behavior.
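A reward that is linear in hand-crafted features is the standard setup in Maximum Entropy IRL, and the augmentation described above amounts to appending one more feature. The feature names, weights, and the entropy-based unpredictability proxy below are illustrative assumptions, not the paper's exact definitions:

```python
import numpy as np

def unpredictability(p_next_actions):
    """Entropy of a surrounding car's predicted action distribution."""
    p = np.asarray(p_next_actions, dtype=float)
    p = p / p.sum()
    return float(-(p * np.log(p + 1e-12)).sum())

def reward(features, weights):
    """Linear reward r(s) = w . phi(s), as in MaxEnt IRL."""
    return float(np.dot(weights, features))

# Baseline features vs. features augmented with the unpredictability term.
phi_base = np.array([0.4, 0.1, 0.7])   # e.g. speed, gap, lane offset (made up)
phi_aug = np.append(phi_base, unpredictability([0.25, 0.25, 0.25, 0.25]))

w_base = np.array([1.0, -2.0, 0.5])
w_aug = np.append(w_base, -0.8)        # penalize unpredictable neighbours

print(reward(phi_base, w_base), reward(phi_aug, w_aug))
```

A uniform action distribution maximizes the entropy term, so with a negative weight the ego-car is pushed away from maneuvers near highly unpredictable neighbours.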
|
| |
| 10:00-11:30, Paper MoAIP-17.11 | Add to My Program |
| From Temporal-Evolving to Spatial-Fixing: A Keypoints-Based Learning Paradigm for Visual Robotic Manipulation |
|
| Riou, Kevin | Nantes University |
| Dong, Kaiwen | China University of Mining and Technology, Xuzhou, 221116, China |
| Subrin, Kévin | Université De Nantes / LS2N |
| Sun, Yanjing | School of Information and Control Engineering, China University |
| Le Callet, Patrick | Nantes University |
Keywords: Imitation Learning, Representation Learning, Sensorimotor Learning
Abstract: Current learning pipelines for robotic manipulation infer movement primitives sequentially along the temporal-evolving axis, which can result in an accumulation of prediction errors and subsequently cause the visual observations to fall out of the training distribution. This paper proposes a novel hierarchical behavior cloning approach that dissociates the standard behaviour cloning (BC) pipeline into two stages. The intuition behind this approach is to eliminate accumulated errors using a fixed spatial representation. In the first stage, a high-level planner is employed to translate the initial observation of the scene into task-specific spatial waypoints. Then, a low-level robotic path planner takes over the task of guiding the robot by executing a set of pre-defined elementary movements or actions known as primitives, with the goal of reaching the previously predicted waypoints. Our hierarchical keypoints-based paradigm aims to simplify the existing temporal-evolving approach into a simpler scheme: directly spatialize the whole sequence of primitives as a set of 8D waypoints from the very first observation. Extensive experiments demonstrate that our paradigm achieves comparable results with Reinforcement Learning (RL) and outperforms existing offline BC approaches, with only a single-shot inference from the initial observation. Code and models are available at: https://github.com/KevinRiou22/spatial-fixing-il
|
| |
| 10:00-11:30, Paper MoAIP-17.12 | Add to My Program |
| Disturbance Injection under Partial Automation: Robust Imitation Learning for Long-Horizon Tasks |
|
| Tahara, Hirotaka | Nara Institute of Science and Technology |
| Sasaki, Hikaru | Nara Institute of Science and Technology |
| Oh, Hanbit | Nara Institute of Science and Technology |
| Anarossi, Edgar | Nara Institute of Science and Technology |
| Matsubara, Takamitsu | Nara Institute of Science and Technology |
Keywords: Imitation Learning, Learning from Demonstration
Abstract: Partial Automation (PA) with intelligent support systems has been introduced in industrial machinery and advanced automobiles to reduce the burden of long hours of human operation. Under PA, operators perform manual operations (providing actions) and operations that switch to automatic/manual mode (mode-switching). Since PA reduces the total duration of manual operation, these two action and mode-switching operations can be replicated by imitation learning with high sample efficiency. To this end, this paper proposes Disturbance Injection under Partial Automation (DIPA) as a novel imitation learning framework. In DIPA, mode and actions (in the manual mode) are assumed to be observables in each state and are used to learn both action and mode-switching policies. The above learning is robustified by injecting disturbances into the operator's actions to optimize the disturbance's level for minimizing the covariate shift under PA. We experimentally validated the effectiveness of our method for long-horizon tasks in two simulations and a real robot environment and confirmed that our method outperformed the previous methods and reduced the demonstration burden.
|
| |
| 10:00-11:30, Paper MoAIP-17.13 | Add to My Program |
| Training Robots without Robots: Deep Imitation Learning for Master-To-Robot Policy Transfer |
|
| Kim, Heecheol | The University of Tokyo |
| Ohmura, Yoshiyuki | The University of Tokyo |
| Nagakubo, Akihiko | National Institute of Advanced Industrial Science and Technology |
| Kuniyoshi, Yasuo | The University of Tokyo |
Keywords: Imitation Learning, Deep Learning in Grasping and Manipulation, Dual Arm Manipulation
Abstract: Deep imitation learning is promising for robot manipulation because it only requires demonstration samples. In this study, deep imitation learning is applied to tasks that require force feedback. However, existing demonstration methods have deficiencies: bilateral teleoperation requires a complex control scheme and is expensive, and kinesthetic teaching suffers from visual distractions caused by human intervention. This research proposes a new master-to-robot (M2R) policy transfer system that does not require a robot in order to teach force feedback-based manipulation tasks. The human directly demonstrates a task using a controller that resembles the kinematic parameters of the robot arm and uses the same end-effector with force/torque (F/T) sensors to measure the force feedback. Using this controller, the operator can feel force feedback without a bilateral system. The proposed method can overcome domain gaps between the master and robot using gaze-based imitation learning and a simple calibration method. Furthermore, a Transformer is applied to infer the policy from F/T sensory input. The proposed system was evaluated on a bottle-cap-opening task that requires force feedback.
|
| |
| 10:00-11:30, Paper MoAIP-17.14 | Add to My Program |
| Imitrob: Imitation Learning Dataset for Training and Evaluating 6D Object Pose Estimators |
|
| Sedlar, Jiri | Czech Technical University |
| Stepanova, Karla | Czech Technical University |
| Skoviera, Radoslav | Czech Institute of Informatics, Robotics, and Cybernetics; Czech |
| Behrens, Jan Kristof | Czech Technical University in Prague, CIIRC |
| Tuna, Matúš | Comenius University in Bratislava |
| Sejnova, Gabriela | Czech Technical University in Prague |
| Sivic, Josef | Czech Technical University |
| Babuska, Robert | Delft University of Technology |
Keywords: Imitation Learning, Object Detection, Segmentation and Categorization, Computer Vision for Manufacturing
Abstract: This paper introduces a dataset for training and evaluating methods for 6D pose estimation of hand-held tools in task demonstrations captured by a standard RGB camera. Despite the significant progress of 6D pose estimation methods, their performance is usually limited for heavily occluded objects, which is a common case in imitation learning, where the object is typically partially occluded by the manipulating hand. Currently, there is a lack of datasets that would enable the development of robust 6D pose estimation methods for these conditions. To overcome this problem, we collect a new dataset (Imitrob) aimed at 6D pose estimation in imitation learning and other applications where a human holds a tool and performs a task. The dataset contains image sequences of nine different tools and twelve manipulation tasks with two camera viewpoints, four human subjects, and left/right hand. Each image is accompanied by an accurate ground truth measurement of the 6D object pose obtained by the HTC Vive motion tracking device. The use of the dataset is demonstrated by training and evaluating a recent 6D object pose estimation method (DOPE) in various setups. The dataset and code are publicly available at http://imitrob.ciirc.cvut.cz/imitrobdataset.php.
|
| |
| MoAIP-18 Regular session, Hall E |
Add to My Program |
| Clone of 'Calibration and Identification' |
|
| |
| |
| 10:00-11:30, Paper MoAIP-18.1 | Add to My Program |
| Accurate and Interactive Visual-Inertial Sensor Calibration with Next-Best-View and Next-Best-Trajectory Suggestion |
|
| Choi, Christopher | Imperial College London |
| Xu, Binbin | University of Toronto |
| Leutenegger, Stefan | Technical University of Munich |
Keywords: Calibration and Identification, Visual-Inertial SLAM, SLAM
Abstract: Visual-Inertial (VI) sensors are popular in robotics, self-driving vehicles, and augmented and virtual reality applications. In order to use them for any computer vision or state-estimation task, a good calibration is essential. However, collecting informative calibration data in order to render the calibration parameters observable is not trivial for a non-expert. In this work, we introduce a novel VI calibration pipeline that guides a non-expert with the use of a graphical user interface and information theory in collecting informative calibration data with Next-Best-View and Next-Best-Trajectory suggestions to calibrate the intrinsics, extrinsics, and temporal misalignment of a VI sensor. We show through experiments that our method is faster, more accurate, and more consistent than state-of-the-art alternatives. Specifically, we show how calibrations with our proposed method achieve higher accuracy estimation results when used by state-of-the-art VI Odometry as well as VI-SLAM approaches.
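The informativeness-driven suggestion step can be illustrated with a D-optimality criterion: pick the candidate view whose measurements most increase the log-determinant of the accumulated Fisher information. The per-view Jacobians and the specific criterion below are a generic information-theoretic sketch, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(3)

# Each candidate view contributes a Jacobian of its expected measurements
# w.r.t. the calibration parameters (here: 6 made-up parameters, 8 rows per
# view). The next-best view is the one with the largest information gain.
n_params = 6
info = np.eye(n_params) * 1e-3                                   # prior information
candidates = [rng.normal(size=(8, n_params)) for _ in range(5)]  # per-view Jacobians

def gain(info, J):
    """Log-determinant increase of the Fisher information after adding view J."""
    new = info + J.T @ J
    return np.linalg.slogdet(new)[1] - np.linalg.slogdet(info)[1]

best = max(range(len(candidates)), key=lambda i: gain(info, candidates[i]))
print("suggest view", best)
```

In an interactive pipeline this loop would repeat: fold the chosen view into `info`, re-rank the remaining candidates, and stop once the expected gain falls below a threshold.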
|
| |
| 10:00-11:30, Paper MoAIP-18.2 | Add to My Program |
| A ROS-Based Kinematic Calibration Tool for Serial Robots |
|
| Pascal, Caroline | ENSTA Paris |
| Doaré, Olivier | UME ENSTA Paris |
| Chapoutot, Alexandre | ENSTA Paris |
Keywords: Calibration and Identification, Software-Hardware Integration for Robot Systems, Kinematics
Abstract: The use of serial robots for industrial and research purposes is often limited by flawed positioning accuracy, caused by the differences between the robot's nominal model and the real one. Such an issue can be solved by means of kinematic calibration, which is usually a tedious and intricate task. In this paper, we propose a complete kinematic calibration procedure relying on established geometric modeling, measurement design, and parameter identification methods, as well as multiple integration tools, to provide high adaptability and simplified handling. The overall process was bundled into a ROS-based modular and user-friendly package, whose main objective is to offer a smooth and fully integrated framework for the kinematic calibration of serial robots. Our solution was successfully tested using a motion tracking device, and increased the overall positioning accuracy of two different serial robots by 75% in a matter of hours.
|
| |
| 10:00-11:30, Paper MoAIP-18.3 | Add to My Program |
| FUSE-D: Framework for UAV System-Parameter Estimation with Disturbance Detection |
|
| Böhm, Christoph | University Klagenfurt |
| Weiss, Stephan | Universität Klagenfurt |
Keywords: Calibration and Identification, Force and Tactile Sensing, Autonomous Vehicle Navigation
Abstract: Modern unmanned aerial vehicles (UAVs) with sophisticated mechanics ask for extended online system identification to aid model-based controls in task execution. In addition, UAVs in adverse environmental conditions require a more detailed environmental disturbance understanding. The necessary combination of online system identification, sensor suite self-calibration, and external disturbance analysis to tackle these issues holistically is currently an open issue. Our proposed FUSE-D approach combines these elements based on a system model at the rotor-speed level and a single global pose sensor (e.g., a tracking system like Optitrack). Besides sensor intrinsics and extrinsics, the framework allows estimating the UAV's rotor geometry, mass, moments of inertia, and the rotors' aerodynamic properties, as well as an external force and where it acts on the UAV. The general formulation allows us to extend the approach to an N-rotor (multi-rotor) UAV and classify the type of external disturbance. We perform a detailed non-linear observability analysis for the 43 + 7N states and do a statistically relevant embedded hardware-in-the-loop performance analysis in the realistic simulation environment Gazebo with RotorS.
|
| |
| 10:00-11:30, Paper MoAIP-18.4 | Add to My Program |
| Multiplanar Self-Calibration for Mobile Cobot 3D Object Manipulation Using 2D Detectors and Depth Estimation |
|
| Dang, Tuan | University of Texas at Arlington |
| Nguyen, Khang | University of Texas at Arlington |
| Huber, Manfred | University of Texas at Arlington |
Keywords: AI-Enabled Robotics, Human-Robot Collaboration, Software Architecture for Robotic and Automation
Abstract: Calibration is the first and foremost step in dealing with the sensor displacement errors that can appear during extended operation and off-time periods, and it enables precise robot object manipulation. In this paper, we present a novel multiplanar self-calibration between the camera system and the robot's end-effector for 3D object manipulation. Our approach first takes the robot end-effector as ground truth to calibrate the camera's position and orientation while the robot arm moves the object in multiple planes in 3D space, and a state-of-the-art 2D vision detector identifies the object's center in the image coordinate system. The transformation between world coordinates and image coordinates is then computed using 2D pixels from the detector and 3D known points obtained by robot kinematics. Next, an integrated stereo-vision system estimates the distance between the camera and the object, resulting in 3D object localization. We test our proposed method on the Baxter robot with two 7-DOF arms and a 2D detector that can run in real time on an onboard GPU. After self-calibrating, our robot can localize objects in 3D using an RGB camera and depth image. The source code is available at https://github.com/tuantdang/calib_cobot.
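Computing a world-to-image transformation from 2D detections and 3D points known via kinematics is a classic least-squares problem; one standard tool is the Direct Linear Transform (DLT). The synthetic projection matrix and points below are made up to illustrate the estimation step, not the paper's full pipeline:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical ground-truth projection (world -> pixel), a 3x4 matrix.
P_true = np.array([[500.0, 0.0, 320.0, 10.0],
                   [0.0, 500.0, 240.0, 5.0],
                   [0.0, 0.0, 1.0, 2.0]])

# 3-D points known from robot kinematics (end-effector as ground truth),
# spread over multiple planes, and their projected 2-D pixel centres.
X = rng.uniform(-0.5, 0.5, size=(20, 3))
Xh = np.hstack([X, np.ones((20, 1))])
uvw = Xh @ P_true.T
uv = uvw[:, :2] / uvw[:, 2:3]

# Direct Linear Transform: each correspondence yields two linear equations
# in the 12 entries of P; solve the homogeneous system via SVD.
A = []
for (x, y, z, w), (u, v) in zip(Xh, uv):
    A.append([x, y, z, w, 0, 0, 0, 0, -u*x, -u*y, -u*z, -u*w])
    A.append([0, 0, 0, 0, x, y, z, w, -v*x, -v*y, -v*z, -v*w])
_, _, Vt = np.linalg.svd(np.asarray(A))
P_est = Vt[-1].reshape(3, 4)
P_est /= P_est[2, 3] / P_true[2, 3]   # fix the arbitrary scale for comparison

print(np.allclose(P_est, P_true, atol=1e-6))
```

With noisy real detections the same system is solved in a least-squares sense, and more point-plane combinations (the "multiplanar" part) better condition the estimate.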
|
| |
| 10:00-11:30, Paper MoAIP-18.5 | Add to My Program |
| Labelling Lightweight Robot Energy Consumption: A Mechatronics-Based Benchmarking Metric Set |
|
| Heredia, Juan | University of Southern Denmark |
| Kirschner, Robin Jeanne | TU Munich, Institute for Robotics and Systems Intelligence |
| Abdolshah, Saeed | Technical University of Munich |
| Schlette, Christian | University of Southern Denmark (SDU) |
| Haddadin, Sami | Technical University of Munich |
| Mikkel, Kjærgaard | University of Southern Denmark |
Keywords: Performance Evaluation and Benchmarking, Energy and Environment-Aware Automation, Actuation and Joint Mechanisms
Abstract: Compliance with global guidelines for sustainable and responsible production in modern industry requires a comparative analysis of consumer devices' energy consumption (EC). This also holds true for the newly established generation of lightweight industrial robots (LIRs). To identify potential strategies for energy optimization, standardized benchmarking procedures are required. However, to the best of the authors' knowledge, there is currently no standardized method for benchmarking the EC of manipulators. In response to this need, we have developed a comprehensive benchmarking framework to evaluate the EC of various LIR designs, delving into the theoretical power consumption under both static and dynamic conditions. Our analysis has led to the proposal of seven metrics: three static and four dynamic. The static metrics (controller consumption, joint electronics consumption, and mechanical brakes' consumption) evaluate the maintenance EC of the robot. Meanwhile, we suggest three dynamic metrics that gauge the system's energy efficiency during motion, with or without payload. We extend this metric selection by introducing the cost-of-transportation map for manipulators. For each of the metrics, we suggest a standardized measurement procedure based on state-of-the-art norms and literature. The metric set and experimental procedures are demonstrated using five manipulators (UR3e, UR5e, FR3, M0609, Gen3). Among the results, we can see interesting trends for the future optimization of electronic components and their architecture, e.g., reducing the robot's EC by decentralizing computation via low-consumption onboard controllers for basic tasks and external servers for complex ones.
|
| |
| 10:00-11:30, Paper MoAIP-18.6 | Add to My Program |
| The Role of Absolute Positioning Error in Hand-Eye Calibration and Robotic Guidance Systems: An Analysis |
|
| Chalus, Michal | University of West Bohemia |
| Vanicek, Ondrej | University of West Bohemia |
| Liska, Jindrich | University of West Bohemia |
Keywords: Calibration and Identification, Computer Vision for Manufacturing, Industrial Robots
Abstract: Robotic manipulators deal with serious issues due to their absolute positioning error. This error is usually compensated for by an operator in classical robot programming using the teach-and-play method. However, it has a significant effect on the accuracy of robotic guidance systems (RGS) that automatically generate the process tool trajectory based on measured data from a sensor. In this paper, we first describe the various components of an RGS that affect its overall accuracy. We then introduce a proposed model for the calibration process (MCP) that can be used to analyze the effect of absolute positioning errors on the accuracy of hand-eye calibration, six-point calibration of a process tool, and the mutual transformation between these tools. Simulations were used to evaluate the proposed MCP model. The results of this analysis are crucial for the practical use of RGS.
|
| |
| 10:00-11:30, Paper MoAIP-18.7 | Add to My Program |
| Robotic Kinematic Calibration with Only Position Data and Consideration of Non-Geometric Errors Using POE-Based Model and Gaussian Mixture Models |
|
| Luo, Xiao | The Chinese University of Hong Kong |
| Xian, Yitian | The Chinese University of Hong Kong |
| Lei, Man Cheong | The Chinese University of Hong Kong |
| Li, Jian | The Chinese University of Hong Kong |
| Xie, Ke | The Chinese University of Hong Kong |
| Zou, Limin | The Chinese University of Hong Kong |
| Li, Zheng | The Chinese University of Hong Kong |
Keywords: Calibration and Identification, Kinematics, Probability and Statistical Methods
Abstract: Kinematic calibration is crucial to improve the positioning accuracy of serial robots. This paper proposes a novel algorithm for robotic kinematic calibration based on an augmented product-of-exponentials (POE) kinematic model using Gaussian mixture models (GMMs) with only position data. In this algorithm, non-geometric errors that cannot be fitted by varying the parameters within the traditional robot model are also considered and compensated. The approach involves a three-stage calibration process that identifies the kinematic model parameters and trains the GMMs. Finally, the algorithm is applied to two serial robots for simulation and experimental validation. The effectiveness of the proposed algorithm is verified by both sets of results, and a significant improvement in error reduction, from 26% to 96%, can be observed in comparison with other existing approaches.
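For background, the product-of-exponentials forward kinematics that such a calibration model augments can be sketched for a toy two-joint planar arm (illustrative geometry, not the paper's robots); each joint contributes one SE(3) matrix exponential:

```python
import numpy as np

def hat(w):
    """3x3 skew-symmetric matrix of a rotation axis."""
    return np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])

def exp_twist(w, q, theta):
    """SE(3) exponential of a revolute twist (unit axis w through point q)."""
    W = hat(w)
    R = np.eye(3) + np.sin(theta) * W + (1 - np.cos(theta)) * (W @ W)
    v = -np.cross(w, q)
    G = np.eye(3) * theta + (1 - np.cos(theta)) * W + (theta - np.sin(theta)) * (W @ W)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = G @ v
    return T

# Two-joint planar arm with unit links along x; zero-configuration pose M.
axes = [np.array([0.0, 0.0, 1.0])] * 2
points = [np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])]
M = np.eye(4)
M[0, 3] = 2.0

def fk(thetas):
    """Product of exponentials: T = exp(xi_1 th_1) exp(xi_2 th_2) M."""
    T = np.eye(4)
    for w, q, th in zip(axes, points, thetas):
        T = T @ exp_twist(w, q, th)
    return T @ M

p = fk([np.pi / 2, -np.pi / 2])[:3, 3]
print(np.round(p, 6))  # end-effector at (1, 1, 0)
```

Calibration in this formulation perturbs the twists (and, in the paper, adds a GMM-based term for the residual non-geometric errors) so that predicted positions match measured ones.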
|
| |
| 10:00-11:30, Paper MoAIP-18.8 | Add to My Program |
| MOISST: Multimodal Optimization of Implicit Scene for SpatioTemporal Calibration |
|
| Herau, Quentin | Huawei, University of Burgundy |
| Piasco, Nathan | Huawei Technologies France |
| Bennehar, Moussab | Lirmm - Umr 5506 |
| Roldao, Luis | Huawei |
| Tsishkou, Dzmitry | Huawei Technologies |
| Migniot, Cyrille | U Bourgogne |
| Vasseur, Pascal | Université De Picardie Jules Verne |
| Demonceaux, Cédric | Université De Bourgogne |
Keywords: Sensor Fusion, Calibration and Identification, Computer Vision for Transportation
Abstract: With the recent advances in autonomous driving and the decreasing cost of LiDARs, the use of multimodal sensor systems is on the rise. However, in order to make use of the information provided by a variety of complementary sensors, it is necessary to accurately calibrate them. We take advantage of recent advances in computer graphics and implicit volumetric scene representation to tackle the problem of multi-sensor spatial and temporal calibration. Thanks to a new formulation of the Neural Radiance Field (NeRF) optimization, we are able to jointly optimize calibration parameters along with scene representation based on radiometric and geometric measurements. Our method enables accurate and robust calibration from data captured in uncontrolled and unstructured urban environments, making our solution more scalable than existing calibration solutions. We demonstrate the accuracy and robustness of our method in urban scenes typically encountered in autonomous driving scenarios.
|
| |
| 10:00-11:30, Paper MoAIP-18.9 | Add to My Program |
| Automatic Spatial Radar Camera Calibration Via Geometric Constraints with Doppler-Optical Flow Fusion |
|
| Ge, Jintian | Nanyang Technological University |
| Yanxin, Zhou | Nanyang Technological University |
| Lou, Baichuan | Nanyang Technological University |
| Lv, Chen | Nanyang Technological University |
Keywords: Calibration and Identification, Sensor Fusion, Computer Vision for Automation
Abstract: Many intelligent robots use a combination of radar and camera sensors to capture environmental information. Robust and accurate perception relies heavily on the result of multi-sensor calibration. Most current spatial calibration methods require a calibration board or a special marker as the target. In this paper, we provide a novel calibration method for an RGBD camera and a millimeter-wave radar that automatically estimates the extrinsic parameters. Our proposed method includes two stages: rough extrinsic parameters are first estimated using object contours as geometric constraints, and the optimum is then reached by optimizing over the difference between the velocities obtained from the camera and the radar. The method only needs an object moving past the sensors and does not require a calibration board. We validate our method through simulation and real-world experiments. We construct a simulation environment in CARLA to verify the performance of our proposed method across different angles. Furthermore, different levels of zero-mean Gaussian noise are added to evaluate the stability of our method. In addition, real-world experiments with different hardware setups are conducted to verify the feasibility of our method in real-world conditions.
|
| |
| 10:00-11:30, Paper MoAIP-18.10 | Add to My Program |
| Extrinsic Calibration of Camera to LIDAR Using a Differentiable Checkerboard Model |
|
| Fu, Lanke Frank Tarimo | University of Oxford |
| Chebrolu, Nived | University of Oxford |
| Fallon, Maurice | University of Oxford |
Keywords: Calibration and Identification
Abstract: Multi-modal sensing often involves determining correspondences between each domain's signals, which in turn depends on the accurate extrinsic calibration of the sensors. Challengingly, the camera-LIDAR sensor modalities are quite dissimilar and the narrow field of view of most commercial LIDARs means that they observe only a partial view of the camera frustum. We present a framework for extrinsic calibration of a camera and a LIDAR using only a simple off-the-shelf checkerboard. It is designed to operate even when the LIDAR observes a significantly truncated portion of the checkerboard. Current state-of-the-art methods often require bespoke manufactured markers or full observation of the entire checkerboard in both camera and LIDAR data which is prohibitive. By contrast, our novel algorithm directly aligns the LIDAR intensity pattern to the camera-detected checkerboard pattern using our differentiable formulation. The key step for achieving accurate extrinsics estimation is the use of the spatial derivatives provided by the differentiable checkerboard pattern, and jointly optimizing over all views. In our experiments, we achieve calibration accuracy in the order of 2-4 mm and demonstrate a 30% error reduction compared to state-of-the-art approaches. We are able to achieve this improvement while using only partial LIDAR views of the checkerboard which allows for a simpler data capture process. We also demonstrate the generalizability of our approach to different combinations of LIDARs and cameras with varying sparsity patterns and noise levels.
|
| |
| 10:00-11:30, Paper MoAIP-18.11 | Add to My Program |
| Graph-Based Visual-Kinematic Fusion and Monte Carlo Initialization for Fast-Deployable Cable-Driven Robots |
|
| Khorrambakht, Rooholla | New York University |
| Damirchi, Hamed | University of Adelaide |
| Dindarloo, Mohammad Reza | K. N. Toosi University of Technology |
| Saki, Aria | K. N. Toosi University of Technology |
| Khalilpour, S. Ahmad | K. N. Toosi University of Technology |
| Taghirad, Hamid | K. N. Toosi University of Technology |
| Weiss, Stephan | Universität Klagenfurt |
Keywords: Parallel Robots, Calibration and Identification, Sensor Fusion
Abstract: Ease of calibration and high-accuracy task-space state estimation based purely on onboard sensors is a key requirement for enabling easily deployable cable robots in real-world applications. In this work, we incorporate the onboard camera and kinematic sensors to drive a statistical fusion framework that presents a unified localization and calibration system which requires no initial values for the kinematic parameters. This is achieved by formulating a Monte-Carlo algorithm that initializes a factor-graph representation of the calibration and localization problem. With this, we are able to jointly identify both the kinematic parameters and the visual odometry scale alongside their corresponding uncertainties. We demonstrate the practical applicability of the framework using our state-estimation dataset recorded with the ARAS-CAM suspended cable-driven parallel robot, which is published as part of this manuscript.
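The Monte-Carlo initialization described above can be illustrated with a minimal sketch, under assumptions of our own: candidate anchor-point sets are sampled uniformly, scored by the least-squares residual against measured cable lengths, and the best candidate seeds the factor-graph refinement. The function names, the sampling box, and the cost are illustrative, not the authors' implementation.

```python
import numpy as np

def cable_lengths(anchors, platform_pos):
    # predicted cable lengths from each anchor point to the platform position
    return np.linalg.norm(anchors - platform_pos, axis=1)

def monte_carlo_init(measured, platform_positions, n_samples=300, box=5.0, seed=0):
    """Sample anchor-point hypotheses; keep the lowest-residual one as the
    initial value handed to a subsequent factor-graph optimizer."""
    rng = np.random.default_rng(seed)
    n_cables = measured.shape[1]
    best, best_cost = None, np.inf
    for _ in range(n_samples):
        anchors = rng.uniform(-box, box, size=(n_cables, 3))
        pred = np.array([cable_lengths(anchors, p) for p in platform_positions])
        cost = float(np.mean((pred - measured) ** 2))
        if cost < best_cost:
            best, best_cost = anchors, cost
    return best, best_cost
```

In the actual system the retained hypothesis would initialize a joint localization/calibration factor graph rather than being used directly.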
|
| |
| 10:00-11:30, Paper MoAIP-18.12 | Add to My Program |
| P2O-Calib: Camera-LiDAR Calibration Using Point-Pair Spatial Occlusion Relationship |
|
| Wang, Su | Robert Bosch |
| Zhang, Shini | Nanyang Technological University, Singapore |
| Qiu, Xuchong | Bosch |
Keywords: Calibration and Identification, Sensor Fusion, Deep Learning Methods
Abstract: Accurate and robust sensor calibration is considered an important building block for follow-up research in the autonomous driving and robotics domains. Current works on extrinsic calibration between 3D LiDARs and monocular cameras mainly focus on target-based and target-less methods. Target-based methods are often utilized offline because of restrictions such as additional target design and target placement limits. Current target-less methods suffer from feature indeterminacy and feature mismatching in various environments. To alleviate these limitations, we propose a novel target-less calibration approach based on 2D-3D edge point extraction using the occlusion relationship in 3D space. Based on the extracted 2D-3D point pairs, we further propose an occlusion-guided point-matching method that improves the calibration accuracy and reduces computation costs. To validate the effectiveness of our approach, we evaluate the method's performance qualitatively and quantitatively on real images from the KITTI dataset. The results demonstrate that our method outperforms existing target-less methods and achieves low error and high robustness, which can contribute to practical applications relying on high-quality camera-LiDAR calibration.
|
| |
| 10:00-11:30, Paper MoAIP-18.13 | Add to My Program |
| Wrench Estimation of Modular Manipulator with External Actuation and Joint Locking |
|
| Kim, Yonghyeok | Seoul National University |
| Lee, Hasun | Seoul National University |
| Lee, Jeongseob | Seoul National University |
| Lee, Dongjun | Seoul National University |
Keywords: Aerial Systems: Mechanics and Control, Distributed Robot Systems, Force Control
Abstract: This paper proposes an external wrench estimation method for a modular manipulator, where each link module is driven with external actuation (e.g., rotors, thrusters) and inter-module joints can be locked to increase the end-effector stiffness or workforce of the manipulator. For such systems, the commonly-used momentum-based observer (MBO) is not suitable due to the presence of unknown joint locking (JL) torque and also the degeneracy of the Jacobian transpose relation as the system degree-of-freedom (DOF) becomes less than six with the joint locking. To overcome this, we propose two novel external wrench estimation algorithms: a distributed algorithm based on recursive Newton-Euler dynamics and a centralized algorithm based on D'Alembert's principle, both using an F/T (force/torque) sensor at the base. Experiments are conducted to demonstrate the effectiveness of the proposed algorithms.
|
| |
| 10:00-11:30, Paper MoAIP-18.14 | Add to My Program |
| Observability-Aware Online Multi-Lidar Extrinsic Calibration |
|
| Das, Sandipan | KTH |
| af Klinteberg, Ludvig | Scania |
| Fallon, Maurice | University of Oxford |
| Chatterjee, Saikat | KTH Royal Institute of Technology |
Keywords: Calibration and Identification, Intelligent Transportation Systems, Localization
Abstract: Accurate and robust extrinsic calibration is necessary for deploying autonomous systems which need multiple sensors for perception. In this paper, we present a robust system for real-time extrinsic calibration of multiple lidars in the vehicle base frame without the need for any fiducial markers or features. We base our approach on matching absolute GNSS and estimated lidar poses in real-time. Comparing rotation components makes the solution more robust than the traditional least-squares approach, which compares translation components only. Additionally, instead of comparing all corresponding poses, we select the poses comprising maximum mutual information based on our novel observability criteria. This allows us to identify a subset of the poses helpful for real-time calibration. We also provide stopping criteria to ensure calibration completion. To validate our approach, extensive tests were carried out on data collected using Scania test vehicles (7 sequences for a total of ~6.5 km). The results presented in this paper show that our approach is able to accurately determine the extrinsic calibration for various combinations of sensor setups.
|
| |
| MoAIP-19 Regular session, Hall E |
Add to My Program |
| Deep Learning Methods I |
|
| |
| |
| 10:00-11:30, Paper MoAIP-19.1 | Add to My Program |
| Recognising Affordances in Predicted Futures to Plan with Consideration of Non-Canonical Affordance Effects |
|
| Arnold, Solvi | Shinshu University |
| Kuroishi, Mami | EPSON AVASYS |
| Karashima, Rin | EPSON AVASYS |
| Adachi, Tadashi | EPSON AVASYS |
| Yamazaki, Kimitoshi | Shinshu University |
Keywords: Deep Learning Methods, Task and Motion Planning, Neurorobotics
Abstract: We propose a novel system for action sequence planning based on a combination of affordance recognition and a neural forward model predicting the effects of affordance execution. By performing affordance recognition on predicted futures, we avoid reliance on explicit affordance effect definitions for multi-step planning. Because the system learns affordance effects from experience data, the system can foresee not just the canonical effects of an affordance, but also situation-specific side-effects. This allows the system to avoid planning failures due to such non-canonical effects, and makes it possible to exploit non-canonical effects for realising a given goal. We evaluate the system in simulation, on a set of test tasks that require consideration of canonical and non-canonical affordance effects.
|
| |
| 10:00-11:30, Paper MoAIP-19.2 | Add to My Program |
| AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness |
|
| Yang, Yizhuo | Nanyang Technological University |
| Yuan, Shenghai | Nanyang Technological University |
| Cao, Muqing | Nanyang Technological University |
| Yang, Jianfei | Nanyang Technological University |
| Xie, Lihua | Nanyang Technological University |
Keywords: Deep Learning Methods, Sensor Fusion, Human Detection and Tracking
Abstract: In this study, we introduce AV-PedAware, a self-supervised audio-visual fusion system designed to improve dynamic pedestrian awareness for robotics applications. Pedestrian awareness is a critical requirement in many robotics applications. However, traditional approaches that rely on cameras and LIDARs to cover multiple views can be expensive and susceptible to issues such as changes in illumination, occlusion, and weather conditions. Our proposed solution replicates human perception for 3D pedestrian detection using low-cost audio and visual fusion. This study represents the first attempt to employ audio-visual fusion to monitor footstep sounds for the purpose of predicting the movements of pedestrians in the vicinity. The system is trained through self-supervised learning based on LIDAR-generated labels, making it a cost-effective alternative to LIDAR-based pedestrian awareness. AV-PedAware achieves comparable results to LIDAR-based systems at a fraction of the cost. By utilizing an attention mechanism, it can handle dynamic lighting and occlusions, overcoming the limitations of traditional LIDAR and camera-based systems. To evaluate our approach's effectiveness, we collected a new multimodal pedestrian detection dataset and conducted experiments that demonstrate the system's ability to provide reliable 3D detection results using only audio and visual data, even in extreme visual conditions. We will make our collected dataset and source code available online for the community to encourage further development in the field of robotics perception systems.
|
| |
| 10:00-11:30, Paper MoAIP-19.3 | Add to My Program |
| A Multitask and Kernel Approach for Learning to Push Objects with a Target-Parameterized Deep Q-Network |
|
| Ewerton, Marco | Idiap Research Institute |
| Villamizar, Michael | IDIAP |
| Jankowski, Julius | Idiap Research Institute and EPFL |
| Calinon, Sylvain | Idiap Research Institute |
| Odobez, Jean-Marc | IDIAP |
Keywords: Deep Learning Methods, Deep Learning for Visual Perception, Perception for Grasping and Manipulation
Abstract: Pushing is an essential motor skill involved in several manipulation tasks, and has been an important research topic in robotics. Recent works have shown that Deep Q-Networks (DQNs) can learn pushing policies (when, where to push, and how) to solve manipulation tasks, potentially in synergy with other skills (e.g. grasping). Nevertheless, DQNs often assume a fixed setting and task, which may limit their deployment in practice. Furthermore, they suffer from sparse-gradient backpropagation when the action space is very large, a problem exacerbated by the fact that they are trained to predict state-action values based on a single reward function aggregating several facets of the task, rendering the model training challenging. To address these issues, we propose a multi-head target-parameterized DQN to learn robotic manipulation tasks, in particular pushing policies, and make the following contributions: i) we show that learning to predict different reward and task aspects can be beneficial compared to predicting a single value function where reward factors are not disentangled; ii) we study several alternatives to generalize a policy by encoding the target parameters either into the network layers or visually in the input; iii) we propose a kernelized version of the loss function, allowing to obtain better, faster and more stable training performance. Extensive experiments on simulations validate our design choices, and we show that our architecture learned on simulated data can achieve high performance in a real-robot setup involving a Franka Emika robot arm and unseen objects.
|
| |
| 10:00-11:30, Paper MoAIP-19.4 | Add to My Program |
| DRKF: Distilled Rotated Kernel Fusion for Efficient Rotation Invariant Descriptors in Local Feature Matching |
|
| Huang, Ranran | Meituan |
| Cai, Jiancheng | Meituan |
| Li, Chao | Beijing University of Posts and Telecommunications |
| Wu, Zhuoyuan | Meituan |
| Liu, Xinmin | Meituan |
| Chai, Zhenhua | Meituan |
Keywords: Deep Learning Methods, Visual Learning, Deep Learning for Visual Perception
Abstract: The performance of local feature descriptors degrades in the presence of large rotation variations. To address this issue, we present an efficient approach to learning rotation invariant descriptors. Specifically, we propose Rotated Kernel Fusion (RKF) which imposes rotations on the convolution kernel to improve the inherent nature of CNN. Since RKF can be processed by the subsequent re-parameterization, no extra computational costs will be introduced in the inference stage. Moreover, we present Multi-oriented Feature Aggregation (MOFA) which aggregates features extracted from multiple rotated versions of the input image and can provide auxiliary knowledge for the training of RKF by leveraging the distillation strategy. We refer to the distilled RKF model as DRKF. Besides the evaluation on a rotation-augmented version of the public dataset HPatches, we also contribute a new dataset named DiverseBEV which is collected during the drone's flight and consists of bird's eye view images with large viewpoint changes and camera rotations. Extensive experiments show that our method can outperform other state-of-the-art techniques when exposed to large rotation variations.
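The core idea of imposing rotations on the convolution kernel can be sketched in a few lines: averaging a kernel with its 0°/90°/180°/270° rotations yields a kernel that is exactly invariant to quarter-turn rotations, so its responses do not change when the input is rotated by multiples of 90°. This is a simplified stand-in for RKF, which additionally uses re-parameterization and distillation; the function name is illustrative.

```python
import numpy as np

def rotated_kernel_fusion(kernel):
    """Fuse a square kernel with its four 90-degree rotations.
    The result satisfies rot90(fused) == fused, i.e. it is invariant
    to quarter-turn rotations of the input."""
    return sum(np.rot90(kernel, k) for k in range(4)) / 4.0
```

Because the fused kernel is rotation-symmetric by construction, no extra cost is incurred at inference time, mirroring the re-parameterization argument in the abstract.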
|
| |
| 10:00-11:30, Paper MoAIP-19.5 | Add to My Program |
| Efficient Q-Learning Over Visit Frequency Maps for Multi-Agent Exploration of Unknown Environments |
|
| Chen, Xuyang | Cognitive Robot Autonomy and Learning Lab |
| Iyer, Ashvin | Purdue University |
| Wang, Zixing | Purdue University |
| Qureshi, Ahmed H. | Purdue University |
Keywords: Deep Learning Methods, Reinforcement Learning, Multi-Robot Systems
Abstract: The robot exploration task has been widely studied, with applications spanning from novel environment mapping to item delivery. For some time-critical tasks, such as rescue in catastrophes, the agent is required to explore as efficiently as possible. Recently, Visit Frequency-based map representation achieved great success in such scenarios by discouraging repetitive visits with a frequency-based penalty. However, its relatively large size and single-agent setting hinder its further development. In this context, we propose the Integrated Visit Frequency Map, which encodes the same information as the Visit Frequency Map in a more compact size, and a visit frequency-based multi-agent information exchange and control scheme that is able to accommodate both representations. Through tests in diverse settings, the results indicate our proposed methods can achieve a comparable level of performance to VFM with lower bandwidth requirements and generalize well to different multi-agent setups, including real-world environments.
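The frequency-based penalty that both VFM and the proposed compact variant rely on can be sketched as follows; the class name, grid shape, and penalty constant are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

class VisitFrequencyMap:
    """Grid of visit counts used to penalise repetitive visits: the shaped
    reward for a cell drops linearly with how often it has been visited."""
    def __init__(self, shape, penalty=0.1):
        self.counts = np.zeros(shape, dtype=int)
        self.penalty = penalty

    def visit(self, cell):
        # record one visit to the given grid cell
        self.counts[cell] += 1

    def shaped_reward(self, cell, base_reward):
        # discourage re-visiting: subtract a penalty proportional to visits
        return base_reward - self.penalty * self.counts[cell]
```

A Q-learning agent trained on `shaped_reward` is steered away from cells it has already covered, which is the mechanism the abstract credits for efficient exploration.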
|
| |
| 10:00-11:30, Paper MoAIP-19.6 | Add to My Program |
| Real-Time Trajectory-Based Social Group Detection |
|
| Jahangard, Simindokht | Monash University |
| Hayat, Munawar | Monash University |
| Rezatofighi, Hamid | Monash University |
Keywords: Deep Learning Methods, Human and Humanoid Motion Analysis and Synthesis, Human Detection and Tracking
Abstract: Social group detection is a crucial aspect of various robotic applications, including robot navigation and human-robot interactions. To date, a range of model-based techniques have been employed to address this challenge, such as the F-formation and trajectory similarity frameworks. However, these approaches often fail to provide reliable results in crowded and dynamic scenarios. Recent advancements in this area have mainly focused on learning-based methods, such as deep neural networks that use visual content or human pose. Although visual content based methods have demonstrated promising performance on large-scale datasets, their computational complexity poses a significant barrier to their practical use in real-time applications. To address these issues, we propose a simple and efficient framework for social group detection. Our approach explores the impact of motion trajectory on social grouping and utilizes a novel, reliable, and fast data-driven method. We formulate the individuals in a scene as a graph, where the nodes are represented by LSTM-encoded trajectories and the edges are defined by the distances between each pair of tracks. Our framework employs a modified graph transformer module and graph clustering losses to detect social groups. Our experiments on the popular JRDB-Act dataset reveal noticeable improvements in performance, with relative improvements ranging from 2% to 11%. Furthermore, our framework is significantly faster, with up to 12x faster inference times compared to state-of-the-art methods under the same computation resources. These results demonstrate that our proposed method is suitable for real-time robotic applications.
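As a simplified stand-in for the pipeline above (which encodes trajectories with an LSTM and clusters with a modified graph transformer), grouping agents by raw trajectory distance and taking connected components conveys the graph construction: nodes are tracks, edges connect tracks that stay close. All names and the threshold are illustrative assumptions.

```python
import numpy as np

def social_groups(trajectories, dist_thresh=1.0):
    """Group agents whose trajectories stay close on average.
    trajectories: array of shape (n_agents, T, 2). Returns a list of
    groups, each a set of agent indices."""
    n = len(trajectories)
    # adjacency from mean pairwise distance over time
    adj = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(trajectories[i] - trajectories[j], axis=1).mean()
            adj[i][j] = adj[j][i] = d < dist_thresh
    # connected components via depth-first search
    groups, seen = [], set()
    for s in range(n):
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            seen.add(u)
            stack.extend(v for v in range(n) if adj[u][v] and v not in comp)
        groups.append(comp)
    return groups
```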
|
| |
| 10:00-11:30, Paper MoAIP-19.7 | Add to My Program |
| Point2Point : A Framework for Efficient Deep Learning on Hilbert Sorted Point Clouds with Applications in Spatio-Temporal Occupancy Prediction |
|
| Pandhare, Athrva Atul | University of Pennsylvania |
Keywords: Deep Learning Methods, Deep Learning for Visual Perception, Mapping
Abstract: The irregularity and permutation invariance of point cloud data pose challenges for effective learning. Conventional methods for addressing this issue involve converting raw point clouds to intermediate representations such as 3D voxel grids or range images. While such intermediate representations solve the problem of permutation invariance, they can result in significant loss of information. Approaches that do learn on raw point clouds either have trouble resolving neighborhood relationships between points or are too complicated in their formulation. In this paper, we propose a novel approach to representing point clouds as a locality-preserving 1D ordering induced by the Hilbert space-filling curve. We also introduce Point2Point, a neural architecture that can effectively learn on Hilbert-sorted point clouds. We show that Point2Point achieves competitive performance on point cloud segmentation and generation tasks. Finally, we show the performance of Point2Point on spatio-temporal occupancy prediction from point clouds.
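The locality-preserving 1D ordering induced by the Hilbert curve can be computed with the classic bit-manipulation index mapping; sorting points by their curve index then yields the kind of ordering the paper learns on. The grid resolution `n` and function names are assumptions for illustration (real point clouds would first be quantized to an integer grid, and the 3D case uses the analogous 3D curve).

```python
def xy2d(n, x, y):
    """Map integer point (x, y) on an n x n grid (n a power of two)
    to its index along the Hilbert space-filling curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # rotate the quadrant so the recursion pattern repeats
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s //= 2
    return d

def hilbert_sort(points, n=16):
    """Order 2D integer points along the Hilbert curve: nearby points in the
    plane tend to be nearby in the resulting 1D ordering."""
    return sorted(points, key=lambda p: xy2d(n, p[0], p[1]))
```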
|
| |
| 10:00-11:30, Paper MoAIP-19.8 | Add to My Program |
| Motion Planning Diffusion: Learning and Planning of Robot Motions with Diffusion Models |
|
| Mueller Carvalho, Joao Andre | Technische Universität Darmstadt |
| Le, An Thai | Technische Universität Darmstadt |
| Baierl, Mark | Technical University of Darmstadt |
| Koert, Dorothea | Technische Universitaet Darmstadt |
| Peters, Jan | Technische Universität Darmstadt |
Keywords: Deep Learning Methods, Learning from Experience
Abstract: Learning priors on trajectory distributions can help accelerate robot motion planning optimization. Given previously successful plans, learning trajectory generative models as priors for a new planning problem is highly desirable. Prior works propose several ways of utilizing this prior to bootstrap the motion planning problem, either sampling the prior for initializations or using the prior distribution in a maximum-a-posteriori formulation for trajectory optimization. In this work, we propose learning diffusion models as priors. We can then sample directly from the posterior trajectory distribution conditioned on task goals, by leveraging the inverse denoising process of diffusion models. Furthermore, diffusion has recently been shown to effectively encode data multimodality in high-dimensional settings, which is particularly well-suited for large trajectory datasets. To demonstrate our method's efficacy, we compare our proposed method - Motion Planning Diffusion - against several baselines in simulated planar robot and 7-dof robot arm manipulator environments. To assess the generalization capabilities of our method, we test it in environments with previously unseen obstacles. Our experiments show that diffusion models are strong priors for encoding high-dimensional trajectory distributions of robot motions.
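Posterior sampling conditioned on task goals can be sketched, at a heavily simplified level, as guided denoising: each step moves the sample along the prior score plus a goal-conditioning gradient. Here the learned diffusion prior is replaced by a standard-normal stand-in score and the stochastic reverse process by its deterministic, noise-free limit; every name and constant is an illustrative assumption, not the paper's method.

```python
import numpy as np

def guided_denoise(x0, goal, weight=4.0, step=0.1, iters=50):
    """Noise-free caricature of posterior sampling: follow the prior score
    (here -x, the score of a standard-normal stand-in prior) plus a
    quadratic goal-conditioning gradient."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        prior_score = -x                      # score of the stand-in prior
        guidance = -weight * (x - goal)       # pull the sample towards the goal
        x = x + step * (prior_score + guidance)
    return x
```

With these constants the iteration converges to `weight * goal / (1 + weight)`, the compromise between the prior mode at zero and the conditioning goal; a trained diffusion model replaces the hand-written score with a learned, multimodal one.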
|
| |
| 10:00-11:30, Paper MoAIP-19.9 | Add to My Program |
| Active Task Randomization: Learning Robust Skills Via Unsupervised Generation of Diverse and Feasible Tasks |
|
| Fang, Kuan | University of California, Berkeley |
| Migimatsu, Toki | Stanford University |
| Mandlekar, Ajay Uday | NVIDIA |
| Fei-Fei, Li | Stanford University |
| Bohg, Jeannette | Stanford University |
Keywords: Deep Learning Methods, Deep Learning in Grasping and Manipulation, Representation Learning
Abstract: Solving real-world manipulation tasks requires robots to be equipped with a repertoire of skills that can be applied to diverse scenarios. While learning-based methods can enable robots to acquire skills from interaction data, their success relies on collecting training data that covers the diverse range of tasks that the robot may encounter during test time. However, creating diverse and feasible training tasks often requires extensive domain knowledge and non-trivial manual labor. We introduce Active Task Randomization (ATR), an approach that learns robust skills through the unsupervised generation of training tasks. ATR selects suitable training tasks, each consisting of an environment configuration and a manipulation goal, by actively balancing their diversity and feasibility. In doing so, ATR effectively creates a curriculum that gradually increases task diversity while maintaining a moderate level of feasibility, which leads to more complex tasks as the skills become more capable. ATR predicts task diversity and feasibility with a compact task representation that is learned concurrently with the skills. The selected tasks are then procedurally generated in simulation with a graph-based parameterization. We demonstrate that the learned skills can be composed by a task planner to solve unseen sequential manipulation problems based on visual inputs. Compared to baseline methods, ATR can achieve superior success rates in single-step and sequential manipulation tasks.
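The diversity/feasibility balancing can be caricatured as a simple scoring rule: prefer novel tasks whose feasibility sits near a moderate target, so tasks are neither trivial nor impossible. The scoring function and target value are illustrative assumptions; the paper instead learns diversity and feasibility predictors over a task representation.

```python
def select_task(candidates, diversity, feasibility, target_feas=0.5):
    """Pick the candidate task that balances novelty against a moderate
    feasibility level (extremes of 0 or 1 are penalised)."""
    def score(task):
        # high diversity is good; feasibility should be neither 0 nor 1
        return diversity[task] * (1.0 - abs(feasibility[task] - target_feas))
    return max(candidates, key=score)
```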
|
| |
| 10:00-11:30, Paper MoAIP-19.10 | Add to My Program |
| Robust Self-Supervised Extrinsic Self-Calibration |
|
| Kanai, Takayuki | Toyota Research Institute |
| Vasiljevic, Igor | Toyota Research Institute |
| Guizilini, Vitor | Toyota Research Institute |
| Gaidon, Adrien | Toyota Research Institute |
| Ambrus, Rares | Toyota Research Institute |
Keywords: Deep Learning Methods, Calibration and Identification
Abstract: Autonomous vehicles and robots need to operate over a wide variety of scenarios in order to complete tasks efficiently and safely. Multi-camera self-supervised monocular depth estimation from videos is a promising way to reason about the environment, as it generates metrically scaled geometric predictions from visual data without requiring additional sensors. However, most works assume well-calibrated extrinsics to fully leverage this multi-camera setup, even though accurate and efficient calibration is still a challenging problem. In this work, we introduce a novel method for extrinsic calibration that builds upon the principles of self-supervised monocular depth and ego-motion learning. Our proposed curriculum learning strategy uses monocular depth and pose estimators with velocity supervision to estimate extrinsics, and then jointly learns extrinsic calibration along with depth and pose for a set of overlapping cameras rigidly attached to a moving vehicle. Experiments on a benchmark multi-camera dataset (DDAD) demonstrate that our method enables self-calibration in various scenes robustly and efficiently compared to a traditional vision-based pose estimation pipeline. Furthermore, we demonstrate the benefits of extrinsics self-calibration as a way to improve depth prediction via joint optimization. Project page: https://sites.google.com/tri.global/tri-sesc
|
| |
| 10:00-11:30, Paper MoAIP-19.11 | Add to My Program |
| Do More with Less: Single-Model, Multi-Goal Architectures for Resource-Constrained Robots |
|
| Wang, Zili | Boston University |
| Threatt, Drew | Boston University |
| Andersson, Sean | Boston University |
| Tron, Roberto | Boston University |
Keywords: Deep Learning Methods, Autonomous Agents
Abstract: Deep learning methods are widely used in robotic applications. By learning from prior experience, the robot can abstract knowledge of the environment, and use this knowledge to accomplish different goals, such as object search, frontier exploration, or scene understanding, with a smaller amount of resources than might be needed without that knowledge. Most existing methods typically require a significant amount of sensing, which in turn has significant costs in terms of power consumption for acquisition and processing, and typically focus on models that are tuned for each specific goal, leading to the need to train, store and run each one separately. These issues are particularly important in a resource-constrained setting, such as with small-scale robots or during long-duration missions. We propose a single, multi-task deep learning architecture that takes advantage of the structure of the partial environment to predict different abstractions of the environment (thus reducing the need for rich sensing), and to leverage these predictions to simultaneously achieve different high-level goals (thus sharing computation between goals). As an example application of the proposed architecture, we consider the specific example of a robot equipped with a 2-D laser scanner and an object detector, tasked with searching for an object (such as an exit) in a residential building while constructing a topological map that can be used for future missions. The prior knowledge of the environment is encoded using a U-Net deep network architecture. In this context, our work leads to an object search algorithm that is complete, and that outperforms a more traditional frontier-based approach. The topological map we produce uses scene trees to qualitatively represent the environment as a graph at a fraction of the cost of existing SLAM-based solutions. Our results demonstrate that it is possible to extract multi-task semantic information that is useful for navigation and mapping directly from bare-bone, non-semantic measurements.
|
| |
| 10:00-11:30, Paper MoAIP-19.12 | Add to My Program |
| Enhancing State Estimation in Robots: A Data-Driven Approach with Differentiable Ensemble Kalman Filters |
|
| Liu, Xiao | Arizona State University |
| Clark, Geoffrey | ASU |
| Campbell, Joseph | Carnegie Mellon University |
| Zhou, Yifan | Arizona State University |
| Ben Amor, Heni | Arizona State University |
Keywords: Deep Learning Methods, Deep Learning for Visual Perception, Deep Learning in Grasping and Manipulation
Abstract: This paper introduces a novel state estimation framework for robots using differentiable ensemble Kalman filters (DEnKF). DEnKF is a reformulation of the traditional ensemble Kalman filter that employs stochastic neural networks to model the process noise implicitly. Our work is an extension of previous research on differentiable filters, which has provided a strong foundation for our modular and end-to-end differentiable framework. This framework enables each component of the system to function independently, leading to improved flexibility and versatility in implementation. Through a series of experiments, we demonstrate the flexibility of this model across a diverse set of real-world tracking tasks, including visual odometry and robot manipulation. Moreover, we show that our model effectively handles noisy observations, is robust in the absence of observations, and outperforms state-of-the-art differentiable filters in terms of error metrics. Specifically, we observe a significant improvement of at least 59% in translational error when using DEnKF with noisy observations. Our results underscore the potential of DEnKF in advancing state estimation for robotics. Code for DEnKF is available at https://github.com/ir-lab/DEnKF
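The classical stochastic ensemble Kalman update that DEnKF builds on (the paper makes it differentiable and replaces fixed models with stochastic neural networks) can be sketched in plain NumPy; the function name and argument shapes are illustrative assumptions.

```python
import numpy as np

def enkf_update(ensemble, H, z, R, rng):
    """One stochastic ensemble Kalman measurement update.
    ensemble: (N, dx) state samples; H: (dz, dx) observation matrix;
    z: (dz,) measurement; R: (dz, dz) measurement noise covariance."""
    N = ensemble.shape[0]
    X = ensemble - ensemble.mean(axis=0)          # state anomalies
    Y = X @ H.T                                   # predicted-observation anomalies
    Pxy = X.T @ Y / (N - 1)                       # state/observation cross-covariance
    Pyy = Y.T @ Y / (N - 1) + R                   # innovation covariance
    K = Pxy @ np.linalg.inv(Pyy)                  # Kalman gain
    # perturbed observations keep the posterior ensemble spread consistent
    zs = z + rng.multivariate_normal(np.zeros(len(z)), R, size=N)
    return ensemble + (zs - ensemble @ H.T) @ K.T
```

In DEnKF the process model producing the prior ensemble and the noise statistics are learned, but the update above is the structural core being differentiated through.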
|
| |
| 10:00-11:30, Paper MoAIP-19.13 | Add to My Program |
| Self-Supervised Category-Level 6D Object Pose Estimation with Optical Flow Consistency |
|
| Zaccaria, Michela | E80Group S.p.A., University of Parma |
| Manhardt, Fabian | Google |
| Di, Yan | Technical University of Munich |
| Tombari, Federico | Technische Universität München |
| Aleotti, Jacopo | University of Parma |
| Giorgini, Mikhail | University of Parma, Elettric 80 S.p.A |
Keywords: Deep Learning for Visual Perception, Deep Learning Methods, RGB-D Perception
Abstract: Category-level 6D object pose estimation aims at determining the pose of an object of a given category. Most current state-of-the-art methods require a significant amount of real training data to supervise their models. Moreover, annotating the 6D pose is very time consuming, error-prone, and it does not scale well to a large amount of object classes. Therefore, a handful of methods have recently been proposed to use unlabelled data to establish weak supervision. In this letter we propose a self-supervised method that leverages the 2D optical flow as a proxy for supervising the 6D pose. To this purpose, we estimate the 2D optical flow between consecutive frames based on the pose estimation. Then, we harness an off-the-shelf optical flow method to enable weak supervision using a 2D-3D optical flow based consistency loss. Experiments show that our approach for self-supervised learning yields state-of-the-art performance on the NOCS benchmark, and it reaches comparable results with some fully-supervised approaches.
|
| |
| MoAIP-20 Late breaking, Hall E |
Add to My Program |
| Late Breaking Posters I |
|
| |
| |
| 10:00-11:30, Paper MoAIP-20.1 | Add to My Program |
| Variable Transmission between Series Elastic Actuator and Quasi-Direct Drive Actuator in One Actuator for Dynamic Interaction Tasks |
|
| Hur, Jungwoo | Sogang University |
| Song, Hangyeol | Sogang University |
| Lee, TaeYun | Sogang University |
| Lee, Wonjun | Sogang University |
| Kim, Jongsoo | Sogang University |
| Jeong, Seokhwan | Mechanical Eng., Sogang University |
Keywords: Actuation and Joint Mechanisms, Mechanism Design, Compliant Joints and Mechanisms
Abstract: This preliminary work introduces a variable transmission-actuation module between a series elastic actuator and a quasi-direct drive actuator. It enables a wide range of torque-speed capabilities by interchanging two transmission paths within a single electric motor for highly dynamic interaction tasks in a wide operational range. The two reduction ratios (9:1 and 63:1) are implemented using two layers of planetary gearsets. A torsion spring is attached to the high-gear-ratio output rotor, but the low-gear-ratio side is directly connected to the output. An active gear shift sleeve facilitates the interchange of the gear ratio, and its performance was demonstrated through an experiment.
|
| |
| 10:00-11:30, Paper MoAIP-20.2 | Add to My Program |
| Design and Control of a Foldable Robot Arm in a Drone for Cleaning Solar Panels |
|
| Choi, Yejin | Chungnam National University |
| Jung, Seul | Chungnam National University |
Keywords: Aerial Systems: Applications, Compliance and Impedance Control, Soft Robot Applications
Abstract: This paper presents the design and control of a foldable robot arm mounted on a drone for cleaning solar panels. The foldable robot arm is designed for drone-based manipulation and is driven by a linear actuator to fold and extend its links. For the robot to clean the solar panel surface, force control is applied to the cleaning device, with a flexure device used to suppress vibration. Experimental studies of solar panel cleaning tasks by the foldable robot arm were conducted.
|
| |
| 10:00-11:30, Paper MoAIP-20.3 | Add to My Program |
| Transformable Multirotor Airframe Design for Infrastructure Inspection |
|
| Paul, Hannibal | Ritsumeikan University |
| Rosales Martinez, Ricardo | Ritsumeikan University |
| Shimonomura, Kazuhiro | Ritsumeikan University |
Keywords: Aerial Systems: Applications, Field Robots, Aerial Systems: Mechanics and Control
Abstract: Aerial manipulators attached to UAVs are useful for performing tasks in hard-to-reach areas, such as inspections of bridges and tunnels. However, the range of motion of the manipulator tip is typically limited to the location of its attachment on the airframe. In the proposed system, a design is developed to enable the manipulator tip to reach all directions surrounding the UAV. A basic transformable airframe design is used as the manipulator body, and additional actuators are employed to maintain the rotors' axis. The proposed design ensures proper alignment of the attached measuring tip with the surface for accurate data collection. It also ensures that the rotor thrust can support the UAV's maximum payload even when the UAV is tilted. The efficiency and usefulness of the system are demonstrated through experimental tests.
|
| |
| 10:00-11:30, Paper MoAIP-20.4 | Add to My Program |
| Towards Robust Cooperative Drone Transportation: Automated Layout Design and Control |
|
| Bosio, Carlo | University of California, Berkeley |
| Mueller, Mark Wilfried | University of California, Berkeley |
Keywords: Aerial Systems: Applications, Intelligent Transportation Systems, Cooperating Robots
Abstract: We present a novel approach to cooperative aerial transportation through a team of drones, using optimal control theory and a hierarchical control strategy. We assume the drones are connected to the payload through rigid attachments, essentially transforming the whole system into a larger flying object with "thrust modules" at the locations of the drones. The placement locations are optimized through a problem formulation using an H2 control-based cost function. The control is then executed hierarchically, with individual commands sent to each drone based on the desired control inputs.
|
| |
| 10:00-11:30, Paper MoAIP-20.5 | Add to My Program |
| Management of Raw Material Yard in Steelworks Using Drones |
|
| Lim, Seungho | POSCO |
| Choi, Jayoung | POSCO |
| Kim, Hyun Hee | POSCO |
Keywords: Aerial Systems: Applications, Object Detection, Segmentation and Categorization
Abstract: Today, POSCO is meeting the expectations of people who hold higher standards regarding quality of life; moreover, it is taking the initiative in responding to environmental issues such as particulate pollutants. As part of this effort, the coal yards will be enclosed first, and the ore yards will be sealed on a phased-in basis. Until the raw material yard is completely sealed, tarpaulin covers will contain fugitive dust and pollutants, and where covering is difficult, the yard is managed in an environmentally friendly way by spraying surface hardener. In response, a drone solution is being developed to monitor the status of yard covers, overstacking, and surface hardener application to enhance implementation levels. The raw material yard in the POSCO Gwang-yang steelworks (more than 160 hectares) is regularly photographed from the air by VTOL fixed-wing drones, and inventory measurements are conducted using automated geospatial information processing technology [1]. We extended this drone solution to create environmental management data by increasing the 3D reconstruction precision and introducing semantic segmentation of orthomosaics from photogrammetry. One major challenge is that it is difficult to distinguish the variety of stockpiles from the many facilities surrounding them. The booms of the reclaimers and stackers that excavate or stack raw material pass over the tops of the stockpiles and create occlusion shadows. A Digital Terrain Model (DTM) is used to separate stockpiles from extra equipment by considering the angle of repose. In addition, the ground, which is sloped according to the drainage design, was compensated for by interpolating in the longitudinal and transversal directions of the yard, allowing the maximum stockpile height to be estimated with a precision of 99.2%. Another challenge is to separate the tarpaulin-covered areas, or areas where surface hardeners were sprayed, from the whole stockpile area.
The similarities between the color and texture of the ore/coal and those of the tarpaulin cover made it difficult for traditional color space-based, edge detection-based, and clustering algorithms to produce reliable segmentation results. Therefore, we applied a deep learning algorithm with PSPNet [2] as the base network to perform semantic segmentation. To do this, we cropped the areas where the raw materials were stacked in the yard to obtain thousands of orthomosaic images; the actual covered area was binary-labeled and used as training data. As a result, the autonomous photogrammetry and semantic segmentation process can reflect the high geometric fidelity of the stockpiles and calculate the inventory and the maximum stack height of each stockpile within an accuracy of 99.2%. In addition, it is expected to enable efficient and systematic yard management by providing environmental management data for the yard, such as the tarpaulin-covered area, the overstack area, and the area sprayed with surface hardener.
|
| |
| 10:00-11:30, Paper MoAIP-20.6 | Add to My Program |
| High-Fidelity Drone Simulation with Depth Camera Noise and Improved Air Drag Force Models |
|
| Kim, Woosung | SungKyunKwan University |
| Luong, Tuan | Sungkyunkwan University |
| Ha, Yoonwoo | Sungkyunkwan University |
| Doh, Myeongyun | SUNGKYUNGWAN University |
| Medrano Yax, Juan Fernando | Sungkyunkwan University |
| Moon, Hyungpil | Sungkyunkwan University |
Keywords: Aerial Systems: Applications, RGB-D Perception, Dynamics
Abstract: Drone simulations offer a safe environment for collecting data and testing algorithms. However, the depth camera sensor in the simulation provides exact depth values without error, which can lead to differences in the behavior of algorithms such as SLAM when used in actual environments. The aerodynamic model in the simulation also differs from reality, leading to larger errors in drag force at high speeds. This discrepancy between the simulation and real-world conditions makes it challenging to apply high-speed drone algorithms developed in the simulation to actual environments. In this paper, we propose a more realistic simulation by implementing a depth camera noise model and an improved aerodynamic drag force model by using a rotor drag force model. Through experimental validation, we demonstrate that our proposed models are suitable for simulating the real depth camera and air drag force. Our depth camera noise model can replicate the values of a real depth camera sensor with a coefficient of determination R^2 value of 0.62, and our air drag force model improves accuracy by 51% compared to the AirSim simulation air drag force model in outdoor flying experiments at 10 m/s.
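The abstract does not give the fitted noise model, but the general shape of such a model can be sketched: zero-mean noise whose standard deviation grows with range, as is typical of stereo-based depth sensors. The coefficients `a` and `b` below are illustrative placeholders, not the paper's fitted values.

```python
import numpy as np

def add_depth_noise(depth, a=0.001, b=0.0005, rng=None):
    """Apply depth-dependent Gaussian noise to a clean depth image
    (metres). Noise std grows quadratically with range, a common
    shape for stereo depth sensors; a and b are illustrative."""
    rng = rng or np.random.default_rng()
    sigma = a + b * depth**2            # per-pixel noise std
    return depth + rng.normal(0.0, 1.0, depth.shape) * sigma
```

A model like this would be applied to each simulated depth frame before it is handed to the perception stack under test.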
|
| |
| 10:00-11:30, Paper MoAIP-20.7 | Add to My Program |
| Autonomous Visual-Based Drone Landing with Adaptive Particle Swarm Optimization and Reinforcement Learning Velocity Controllers |
|
| Wu, Li-Fan | Purdue University |
| Wang, Zihan | Purdue University |
| Rastgaar, Mo | Purdue University |
| Mahmoudian, Nina | Purdue University |
Keywords: Aerial Systems: Perception and Autonomy, Machine Learning for Robot Control
Abstract: Precise landing of Unmanned Aerial Vehicles (UAVs) onto moving platforms like Autonomous Surface Vehicles (ASVs) in GPS-denied environments is both important and challenging for collaborative navigation tasks. The UAV needs to land within a confined space onboard the ASV to get recharged, while the ASV is subject to translational and rotational disturbances due to wind and water flow. Existing solutions either use high-level waypoint navigation that does not adapt well to robust landing on dynamic targets, require manual tuning of controller parameters, or need costly sensors for target localization. To enable precise UAV landing on dynamic platforms, this paper presents a visual-based PID velocity controller trained with the Particle Swarm Optimization (PSO) algorithm and Q-learning (RL) in simulation, without parameter tuning before the field test. The PSO initializes the Q-table and speeds up the training process of Q-learning. For experiments, a custom-made UAV with a low-cost RGB camera and a distance sensor was utilized together with a dual fiducial marker design that enabled low-altitude target tracking. The experiments in both simulation and field validate the high accuracy and stability of the PSO-trained controller. The proposed approach does not depend on a specific sensor, and precise landing can be achieved in a wide range of applications.
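The learning side of this pipeline is standard tabular Q-learning; what the paper changes is the initialization of the Q-table (seeded from PSO-evaluated controller parameters rather than zeros). A minimal sketch of the unchanged update rule:

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update. In the paper's pipeline
    Q is pre-initialized from PSO-evaluated controller parameters
    instead of zeros, which is what speeds up training; the update
    rule itself is unchanged."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```

With a better-than-zero initial Q, early episodes already behave reasonably, and the update only has to refine the table rather than discover the policy from scratch.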
|
| |
| 10:00-11:30, Paper MoAIP-20.8 | Add to My Program |
| Dynamic Communication for Flexible and Resilient Robotic Systems |
|
| Figat, Maksym | Warsaw University of Technology |
Keywords: Agent-Based Systems, Behavior-Based Systems, Control Architectures and Programming
Abstract: This article presents a dynamic communication model for fault-tolerant robotic systems that improves flexibility and responsiveness. It introduces conditional communication channels and adaptive communication modes to handle failures, prevent deadlocks and desynchronisation, and improve system reliability and performance. Our research contributes to the advancement of resilient and intelligent robotic systems.
|
| |
| 10:00-11:30, Paper MoAIP-20.9 | Add to My Program |
| An Innovative Victim Search Approach by Subtracting Pix2Pix-Based Pseudo Propeller Sound-Image from UAV-Mounted Microphone Captured Sound-Image |
|
| Furusawa, Tomoki | Shibaura Institute of Technology |
| Premachandra, Chinthaka | Shibaura Institute of Technology |
Keywords: AI-Based Methods, Human Detection and Tracking, Search and Rescue Robots
Abstract: The use of unmanned aerial vehicles (UAVs) is attracting attention for surveying damage from natural disasters and, more importantly, for search-and-rescue operations. Existing approaches to victim search rely on analyzing images captured by a camera mounted on the UAV. However, such methods cannot find victims who are not captured by the camera. Therefore, this study considers an audio-based search. Specifically, by transmitting a voice message from the UAV's onboard speaker to the disaster area and capturing the victims' response sounds with the onboard microphone, the presence of victims can be confirmed. However, because the UAV's microphone captures both the victims' voices and the UAV's own sound, extracting the victims' voices from the overall sound spectrogram is a key challenge. To overcome this problem, either the position of the sound source is measured using multiple microphones installed on the UAV, or the propeller sound is reduced with a neural network (NN) model
|
| |
| 10:00-11:30, Paper MoAIP-20.10 | Add to My Program |
| Enabling a Robot to Know Where It Is on Campus |
|
| Keys, Zoie | Hendrix College |
Keywords: Autonomous Vehicle Navigation
Abstract: The goal of this research project is to see how the Create3 robot can be used in tandem with a GPS sensor to determine the robot's location on my college campus. I wish to be able to have the robot correctly identify key locations on campus, paved paths to different areas of campus, and different classrooms within an academic building. Then, the robot should be able to be given one location on campus and travel to that location as best as it can.
|
| |
| 10:00-11:30, Paper MoAIP-20.11 | Add to My Program |
| Generating Optimized and Smooth Path for Two-Body Vehicle Reverse Motion: Initial Guess for NMPC Solver |
|
| Ghanbarpour, Alireza | University of California at Berkeley |
| Hirao, Motohiro | University of California, Berkeley |
| Tomizuka, Masayoshi | University of California |
| Ghaemi Osgouie, Kambiz | University of Tehran, College of Engineering, Caspian Faculty Of |
|
|
| |
| 10:00-11:30, Paper MoAIP-20.12 | Add to My Program |
| Improving the Performance of Learned Controllers in Behavior Trees Using Value Function Estimates at Switching Boundaries |
|
| Kartašev, Mart | KTH Royal Institute of Technology |
| Ogren, Petter | Royal Institute of Technology (KTH) |
Keywords: Behavior-Based Systems, Control Architectures and Programming, Reinforcement Learning
Abstract: Behavior trees represent a modular way to create an overall controller from a set of sub-controllers solving different sub-problems. These sub-controllers can be created using various methods, such as classical model based control or reinforcement learning (RL). If each sub-controller satisfies the preconditions of the next sub-controller, the controller will achieve the overall goal. However, even if all sub-controllers are locally optimal in achieving the preconditions of the next, with respect to some performance metric such as completion time, the overall controller might be far from optimal with respect to the same performance metric. In this paper, we show how the performance of the overall controller can be improved if we use approximations of value functions to inform the design of a sub-controller of the needs of the next one. We also show how, under certain assumptions, this leads to a globally optimal controller when the process is executed on all sub-controllers. Finally, this result also holds when some of the sub-controllers are already given. Regardless of if these controllers are manually designed or learned, if we are constrained to the use of some existing sub-controllers the overall controller will be globally optimal given such a constraint.
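One way to read the core idea in code: when sub-controller i hands control over at state s, its terminal reward is augmented with the next sub-controller's value estimate, so i is optimized for states that are good starting points for its successor rather than merely for satisfying the precondition. This is a schematic sketch, not the paper's formulation; `next_value_fn` is a hypothetical value-function approximation for the following sub-task.

```python
def shaped_terminal_reward(base_reward, s, next_value_fn):
    """Credit sub-controller i at handover state s with the next
    sub-controller's value estimate V_{i+1}(s). `next_value_fn` is a
    hypothetical approximation of that value function (sketch only)."""
    return base_reward + next_value_fn(s)
```

Training each sub-controller against a reward shaped this way is what couples locally optimal sub-controllers into a globally better overall controller.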
|
| |
| 10:00-11:30, Paper MoAIP-20.13 | Add to My Program |
| Investigating the Impact of Spinal Joint Dynamics on a Sprawling Robot Using Deep Reinforcement Learning |
|
| Kurkutlu, Omer | University of Notre Dame |
| Ozkan-Aydin, Yasemin | University of Notre Dame |
Keywords: Bioinspired Robot Learning, Machine Learning for Robot Control, Actuation and Joint Mechanisms
Abstract: In recent years, there has been a growing interest in developing adaptable and versatile terrestrial robotic systems capable of navigating diverse terrains. One promising approach involves integrating body undulation into sprawling robots, enabling the legs to execute lateral sweeping motions in relation to the body [1]. Prior work on sprawling robots has shown that the coordinated undulation of the robot's body assists in achieving efficient and adaptive movement on level granular media [2] and flat surfaces [3] with improved stability and maneuverability. However, the impact of body undulation on sprawling quadruped robots while navigating diverse challenging terrains has not been extensively explored. Here, we aim to investigate and analyze the effects of body undulation on a sprawling quadruped robot's locomotion performance across various terrains utilizing deep reinforcement learning algorithms by incorporating the dynamics of spinal joints into the learning process. To achieve this, we utilize both simulated and physical robotic systems. The physical robot consists of four 2-degree-of-freedom (DoF) limbs and two body parts connected via a servo motor (Fig. 1a). To facilitate the learning process, we develop a comprehensive simulation model of the sprawling robot (Fig. 1b) using the Robot Operating System (ROS) and the Gazebo simulation environment [4], which interfaces with the Open Dynamics Engine (ODE) physics engine. The physical properties, including the dynamics of the spinal joints, are accurately represented using the Unified Robot Description Format (URDF). We formulate the leg-body coordination problem as a Markov Decision Process (MDP), represented by a tuple (S, A, T, R, γ), where S and A denote the state space and action space of the robot, respectively. At each time step, the robot selects an action from A, resulting in transitions of the legs and body from the current state to a new state with a certain transition probability T. 
The Deep Reinforcement Learning (DRL) model then aims to maximize the cumulative reward R obtained during the interactions between the robot and its environment. The learning objective is for the robot to effectively navigate from a starting point to a goal without falling, relying solely on proprioceptive measurements from joint encoders. We train the controller with two different configurations: one with rigid spinal joints and another with flexible spinal joints. Through extensive experimentation and analysis, we assess the impact of various spinal joint parameters on the robot�s performance metrics, such as speed, stability, and energy efficiency. Moreover, we explore different reinforcement learning techniques, including Proximal Policy Optimization (PPO) and policy gradients, to train the control policies of the robot in an end-to-end manner.
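The quantity the DRL model maximizes is the standard discounted return over an episode. A minimal sketch (the reward values and `gamma` below are illustrative, not the paper's):

```python
def discounted_return(rewards, gamma=0.99):
    """Cumulative discounted reward G = sum_t gamma^t * r_t,
    computed backwards over one episode's reward sequence."""
    G = 0.0
    for r in reversed(rewards):
        G = r + gamma * G
    return G
```

Algorithms such as PPO optimize the policy so that the expectation of this quantity, over trajectories induced by the MDP's transition probabilities, is maximized.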
|
| |
| 10:00-11:30, Paper MoAIP-20.14 | Add to My Program |
| A Bioinspired Modular Linear Actuator Architecture for Robotics |
|
| Ruddy, Bryan P. | University of Auckland |
Keywords: Biologically-Inspired Robots, Actuation and Joint Mechanisms, Cellular and Modular Robots
Abstract: We present a bioinspired design architecture for a modular linear permanent magnet actuator for wearable and mobile robots. The architecture organizes a set of identical coil and magnet units into a hierarchical structure that provides local control, energy, and cooling. Work is underway to fabricate the components of an example motor.
|
| |
| 10:00-11:30, Paper MoAIP-20.15 | Add to My Program |
| Bio-Inspired Hummingbird Robot Flying through Obstacles and Wind Gusts |
|
| Zhou, Yiming | Purdue University |
| Tu, Zhan | Beihang University |
| Fei, Fan | Purdue University |
| Xiao, Rudi | Dexter Southfield |
| Deng, Xinyan | Purdue University |
Keywords: Biologically-Inspired Robots, Biomimetics
Abstract: Flying animals have demonstrated remarkable adaptability and recovery capabilities in the face of adverse conditions throughout their lifetimes. In our previous work, we have successfully demonstrated that a bio-inspired hummingbird robot can maintain flight stability even with a considerable loss of wing area. In this work, we further investigate the robot's resilience in two additional challenges: collision in cluttered spaces and wind gust disturbances. These scenarios are frequently encountered by birds and insects, and insight into the underlying mechanisms animals employ can serve as a design principle for the development of next-generation bio-inspired robots capable of navigating complex environments. For the first scenario, we tested point-to-point tracking of the hummingbird robot while navigating through a cluttered path obstructed by random obstacles. PVC pipes with varying diameters were randomly placed as barriers along the flight trajectory. The number of pipes varies from two to six. When encountering obstacles, the wings of the robot exhibited a passive rebounding behavior. The robot effectively navigated through the spatial gaps between barriers after multiple rebounds. The flexible wings, mounted on the reciprocal thorax joint with a torsional spring, were able to bump through obstacles without crashing the vehicle or breaking the wings. For the second scenario, a horizontal wind tunnel was constructed to generate wind gust disturbance to the hovering hummingbird robot. A honeycomb configuration was incorporated into the outflow section of the wind tunnel to ensure uniform flow. The robot was commanded to hover at a set position 50 cm away from the wind tunnel. Both frontal wind gusts ranging from 1.5 to 3.3 m/s and lateral gusts ranging from 1.5 to 2.5 m/s were induced. The results demonstrated the robot's ability to adjust its posture and movement in order to maintain position and attitude stabilization. 
Under frontal gust, body pitch angle tilted proportionally to the wind gust speed. Given the same wind velocity, lateral wind gusts caused higher perturbation of the vehicle, compared to the frontal wind gust.
|
| |
| 10:00-11:30, Paper MoAIP-20.16 | Add to My Program |
| Mechanical Intelligence in Undulatory Locomotors |
|
| Wang, Tianyu | Georgia Institute of Technology |
| Pierce, Christopher | Georgia Institute of Technology |
| Kojouharov, Velin | Georgia Institute of Technology |
| Chong, Baxi | Georgia Institute of Technology |
| Diaz, Kelimar | Georgia Institute of Technology |
| Lu, Hang | Georgia Institute of Technology |
| Goldman, Daniel | Georgia Institute of Technology |
Keywords: Biologically-Inspired Robots, Biomimetics, Redundant Robots
Abstract: In the study of biological limbless locomotion, the role of "mechanical intelligence" -- passive processes controlled by physical properties -- is often overlooked, which limits the effectiveness of robotic models aiming to replicate these creatures' locomotion performance. This work demonstrates the significance of mechanical intelligence in limbless locomotion in complex terrain, using a comparative study of a nematode worm, Caenorhabditis elegans, and a robot developed to resemble the bilateral actuation mechanism in limbless organisms. Through experiments in laboratory models of complex environments, we found that the robot effectively models nematodes' kinematics and locomotion performance with open-loop control, suggesting that mechanical intelligence reduces the requirement for active sensing and feedback during obstacle navigation. Moreover, we demonstrated that mechanical intelligence facilitates effective open-loop robotic locomotion in diverse indoor and outdoor environments. This research not only presents the general principles of mechanical intelligence in terrestrial limbless locomotion across biological and robotic systems, but also offers a novel design and control paradigm for limbless robots for applications such as search-and-rescue operations and extraterrestrial explorations.
|
| |
| 10:00-11:30, Paper MoAIP-20.17 | Add to My Program |
| Exploring the Sea Turtle Locomotion Mechanics for Biomimetic Robotic Design |
|
| Chikere, Nnamdi | University of Notre Dame |
| McElroy, John | University College Dublin |
| Ozkan-Aydin, Yasemin | University of Notre Dame |
Keywords: Biologically-Inspired Robots, Biomimetics, Soft Robot Materials and Design
Abstract: Despite significant advancements in robotic locomotion, developing robots capable of traversing complex environments remains challenging. However, inspiration can be drawn from certain animals that exhibit locomotion in terrestrial and aquatic environments, leading to the development of robotic systems with versatile applications in underwater exploration, environmental monitoring, and search and rescue operations. Sea turtles, in particular, demonstrate unique locomotion mechanics as they navigate diverse environments, making them a source of inspiration for advancing bioinspired robots. Prior research has explored several aspects of locomotory gait development in robotic turtles. In this study, we explore how the flipper morphology and gait patterns affect the terrain adaptability of a quadruped robot inspired by sea turtles. We developed a simplified robotic system inspired by sea turtles. The robot comprises a 12.5 cm long oval-shaped body frame, designed using Solidworks and 3D printed with a Stratasys F170 printer using ABS. The body houses a battery, control unit, and sensor arrays. It features four independently actuated flippers: larger fore-flippers and smaller hind flippers. A comprehensive series of tests were performed on the sea turtle-inspired robot traversing varied terrains, such as sandy landscapes, hard surfaces, stairs, and rocky paths. A key aspect of our investigation was the exploration of flipper morphology, focusing on the flipper flexibility characteristics that enable sea turtles to navigate efficiently across a wide range of environments. Additionally, we explored the locomotory gaits of sea turtles. Sea turtles, particularly the hatchlings, exhibit a remarkable range of locomotion strategies, dynamically adapting their gait patterns in response to environmental variations. By integrating these locomotory gaits into our robot, we aim to enhance its overall performance and versatility. 
The interplay between flipper morphology and locomotory gaits in response to varied environmental conditions was another focal point of our study. Our experimental results showed that the diagonal gait was most adaptable to a variety of terrains as it could climb steps, exhibited a traversal speed of 0.68 ± 0.1 body length (BL)/cycle on rocks, and a higher speed of 0.82 ± 0.3 BL/cycle on damp, bumpy sand. The all-together gait achieved a peak speed of 0.45 ± 0.4 BL/cycle on damp, bumpy sand but dropped by 40% and 80% on rocks and upsteps, respectively. Rigid flippers boosted the traversal speed of both gaits by 30% on sand and steps but increased obstacle encounters on rocks by 20%, leading to a mean path deviation of 45 degrees. The outcomes of our experiments and analyses will not only provide valuable insights into the practical implementation of biomimicry in robotic systems but will also improve our understanding of sea turtle locomotion in challenging environments.
|
| |
| 10:00-11:30, Paper MoAIP-20.18 | Add to My Program |
| Construction and Preliminary Performance Evaluation of Polychaete-Inspired Robot |
|
| Pham, Huy | Case Western Reserve University |
| Norville, Malyka | Howard University |
| Tyszka, Benjamin | Case Western Reserve University |
| Neel, Alex | Case Western Reserve University |
| Lee, Christian | Case Western Reserve University |
| Daltorio, Kathryn A | Case Western Reserve University |
Keywords: Biomimetics, Biologically-Inspired Robots, Soft Robot Materials and Design
Abstract: The diversity of worm-like animals has inspired robots that can better traverse spaces inaccessible or difficult for humans. Here, we are developing a robot inspired by polychaete worms, which locomote with undulation and peristalsis. Our robot, called "Polysectoid", is constructed from long ribbons and can perform different modes of locomotion, allowing it to explore various environments and reduce the need for different robots.
|
| |
| 10:00-11:30, Paper MoAIP-20.19 | Add to My Program |
| Underactuated Gaits in a Bioinspired Swimming Robot with a Bistable Tail |
|
| Chivkula, Prashanth | Clemson University |
| Rodwell, Colin | Clemson University |
| Tallapragada, Phanindra | Clemson University |
Keywords: Biomimetics, Marine Robotics, Underactuated Robots
Abstract: Fish outperform current underwater robots in speed, efficiency of locomotion, and agility, in part due to their flexible appendages that are capable of rich combinations of modes of motion. In fish-like robots, actuating many different modes of oscillation of tails or fins can become a challenge. This paper presents a highly underactuated (with a single actuator) fish-like robot with a bistable tail with a double-well elastic potential. Oscillations of such a tail depend on frequency and amplitude of excitation, and tuning the frequency-amplitude can produce controllable oscillations in different modes. This robot design is inspired by recent work on underactuated flexible swimming robots driven by a single rotor. The oscillations of the rotor can propel and steer the robot, but saturation of the rotor makes performing long turns challenging. This paper demonstrates that by adding geometric bistability to the flexible tail, turns can be performed by controllably exciting single-well oscillations in the tail, while exciting double-well oscillations allows straight-line motion.
|
| |
| 10:00-11:30, Paper MoAIP-20.20 | Add to My Program |
| Balancing Memorization and Generalization in RNNs for High Performance Brain-Machine Interfaces |
|
| Costello, Joseph | University of Michigan |
| Temmar, Hisham | University of Michigan |
| Mender, Matthew J. | University of Michigan |
| Cubillos, Luis H. | University of Michigan |
| Wallace, Dylan Michael | University of Michigan |
| Willsey, Matthew S. | University of Michigan |
| Patil, Parag G. | University of Michigan |
| Chestek, Cynthia | University of Michigan |
Keywords: Brain-Machine Interfaces, Physically Assistive Devices, Prosthetics and Exoskeletons
Abstract: Intracortical brain-machine interfaces (BMIs) offer a solution to restore motor function to people with paralysis. BMIs record and filter noisy neural signals from implanted electrodes, predict user intentions using a decoding algorithm, and send control commands to outputs such as a robotic prosthesis or muscle stimulator. While the accuracy of current real-time decoding algorithms is limited, recurrent neural networks (RNNs) using modern techniques have shown promise in more accurately predicting movements from neural signals. However, they have yet to be rigorously evaluated against other decoding algorithms in a closed-loop setting where they may be unusable due to overfitting. Additionally, RNNs can memorize and generate movement patterns, yet it remains unclear if memorization can lead to better closed-loop performance. Here we tested RNN decoders against other neural network architectures in a real-time BMI controlled by a non-human primate. One rhesus macaque was implanted with 96-channel Utah arrays in motor cortex and trained to perform a target-acquisition task requiring simultaneous movements of one or two finger groups. Spiking-band power was recorded from each channel, averaged into 32 ms bins, and fed into a decoder to predict finger velocity and position. After initial hand-control trials, we trained and evaluated five decoders in online trials: two RNN architectures (LSTM and GRU), a convolutional feedforward network, a transformer network, and a linear Kalman filter. In additional tests, we trained and evaluated LSTM decoders on tasks with reduced numbers of targets and distinct movements. Across one and two finger online tasks, LSTMs outperformed convolutional and transformer-based neural networks, averaging 18% higher throughput than the convolution network across three comparison days. 
On simplified tasks with a reduced movement set, RNN decoders were allowed to memorize movement patterns and matched able-bodied control for up to four target postures. Simply training on fewer movements resulted in "memorization" of those movements. Performance gradually dropped as the number of distinct movements increased but did not fall below fully continuous decoder performance. Simulated datasets suggest that increasing task complexity requires more training data or input channels to maintain decode accuracy. Finally, in a two-finger task where one degree-of-freedom had poor input signals, we recovered functional control using RNNs trained on a reduced movement set for the finger with poor input signals. This allowed the RNN to act as both a movement classifier and a continuous decoder. Our results suggest that RNNs can enable functional real-time BMI control by learning and generating accurate movement patterns. Further work may explore modifying internal decoder dynamics, encouraging weaker or stronger dynamics to more accurately produce stereotyped movements.
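The decoding pipeline described above (binned spiking-band power in, finger kinematics out) can be sketched with a single hand-rolled LSTM cell. This is an illustrative toy in pure Python with random stand-in weights, not the authors' trained decoder; the hidden size and the absence of a trained readout are assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W):
    """One LSTM cell step: x is a 32 ms bin of spiking-band power,
    (h, c) the hidden/cell state, W a dict of weight rows per gate."""
    z = x + h  # concatenated input and previous hidden state
    gates = {}
    for g in ("i", "f", "o", "g"):
        pre = [sum(w * v for w, v in zip(row, z)) for row in W[g]]
        # candidate gate "g" uses tanh; input/forget/output gates use sigmoid
        gates[g] = [math.tanh(p) if g == "g" else sigmoid(p) for p in pre]
    c_new = [f * cc + i * gg
             for f, cc, i, gg in zip(gates["f"], c, gates["i"], gates["g"])]
    h_new = [o * math.tanh(cv) for o, cv in zip(gates["o"], c_new)]
    return h_new, c_new

random.seed(0)
n_channels, n_hidden = 96, 8          # 96-channel Utah array, tiny hidden state
W = {g: [[random.gauss(0, 0.1) for _ in range(n_channels + n_hidden)]
         for _ in range(n_hidden)] for g in "ifog"}
h = [0.0] * n_hidden
c = [0.0] * n_hidden
for _ in range(5):                    # five 32 ms bins of (random) power
    x = [random.random() for _ in range(n_channels)]
    h, c = lstm_step(x, h, c, W)
# a linear readout of h would then give finger velocity and position
print(len(h))
```

A real decoder would of course learn the weights from the calibration run rather than sample them randomly.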
|
| |
| 10:00-11:30, Paper MoAIP-20.21 | Add to My Program |
| A Deep-Learning-Augmented Kalman Filter for Brain-Machine Interfaces |
|
| Cubillos, Luis H. | University of Michigan |
| Revach, Guy | ETH Zürich |
| Costello, Joseph | University of Michigan |
| Temmar, Hisham | University of Michigan |
| Mender, Matthew J. | University of Michigan |
| Ni, Xiaoyong | ETH Zürich |
| Kelberman, Madison | University of Michigan |
| Wallace, Dylan Michael | University of Michigan |
| Willsey, Matthew S. | University of Michigan |
| van Sloun, Ruud J.G. | Eindhoven University of Technology |
| Shlezinger, Nir | Ben-Gurion University of the Negev |
| Patil, Parag G. | University of Michigan |
| Chestek, Cynthia | University of Michigan |
Keywords: Brain-Machine Interfaces, Prosthetics and Exoskeletons, Rehabilitation Robotics
Abstract: INTRODUCTION: People with spinal cord injuries often need to rely on others for basic tasks, which limits their independence. A potential solution to this issue lies in brain-machine interfaces (BMIs), which enable patients to interact with external devices, such as a computer or a robotic arm. BMIs work by reading noisy neural activity using microelectrodes embedded in the brain cortex. Then, linear or non-linear decoding algorithms take the neural activity as input in real time and predict control signals, which can be used to direct external devices. Traditional linear decoder algorithms, such as the Kalman filter (KF), have been used widely in robotic applications. Linear decoders are considered reliable due to their explainability, as understanding the relationship between input and output promotes safety by preventing unexpected behaviors. However, linear decoders struggle to model the likely non-linear relationship between neural activity and movement. In contrast, deep learning algorithms show promising results but raise safety concerns due to their 'black-box' nature, limiting their use in controlling physical devices. Recently, KalmanNet was proposed as a way of combining the advantages of deep learning and the explainability of the KF by using a neural network to compute the Kalman gain. In this study, we adapted and applied KalmanNet for BMI applications. METHODS: A rhesus macaque with microelectrode arrays implanted in the primary motor cortex of the brain was trained in a dexterous finger task, while finger kinematics and neural activity were recorded. Both offline (pre-recorded data, n = 3 days) and online (closed-loop control, n = 2 days) trials were conducted to compare KalmanNet's performance against the traditional KF. The parameters for both models were trained daily with a single 500-trial calibration run.
RESULTS: In offline trials, KalmanNet achieved a significantly higher correlation with the actual velocities (112% increase, p<1E-4), indicating a more accurate movement prediction. During the online trials, KalmanNet increased throughput (28% increase, p<1E-7), reduced the orbiting time (71% reduction, p<1E-20), and shortened the time to target (19% reduction, p<1E-4), compared to the KF, all indicative of improved performance. DISCUSSION: KalmanNet's Kalman gain, which dictates how to balance the trust between the previous prediction and the current measurement, appeared to vary with the output velocities, increasing for higher velocities and decreasing for lower ones. Thus, KalmanNet appears to act as a non-linear trust system that modulates the Kalman gain to trust the neural activity for higher velocities or the evolution model for lower velocities. These findings suggest that KalmanNet may offer significant performance improvements over the KF while maintaining most of its explainable structure and without increasing the need for data, making it an attractive alternative to existing decoding algorithms.
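The core idea, replacing the covariance-based Kalman gain with a learned, innovation-dependent gain, can be illustrated with a scalar toy filter. The `toy_gain` function below is a hypothetical stand-in for KalmanNet's network, chosen only to mimic the reported behaviour of trusting the measurement more at higher implied velocities.

```python
def kalman_net_step(x_prev, z, gain_fn):
    """One filter step in the KalmanNet spirit: predict with the state-
    evolution model, then correct using a gain produced by a function of
    the innovation instead of the usual covariance recursion."""
    x_pred = x_prev                  # trivial random-walk evolution model
    innovation = z - x_pred
    K = gain_fn(innovation)          # stand-in for the network's Kalman gain
    return x_pred + K * innovation

def toy_gain(innovation):
    """Gain in (0, 1) that grows with the innovation magnitude, so large
    apparent velocities lean on the measurement, small ones on the model."""
    mag = abs(innovation)
    return mag / (mag + 1.0)

x = 0.0
for z in [0.1, 0.2, 2.0, 2.1, 0.0]:  # noisy scalar "velocity" measurements
    x = kalman_net_step(x, z, toy_gain)
print(round(x, 3))
```

In the actual method the gain network is trained end-to-end on the calibration data; here it is a fixed analytic function purely for illustration.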
|
| |
| 10:00-11:30, Paper MoAIP-20.22 | Add to My Program |
| Towards Collision Avoidance for UAVs to Guide the Visually Impaired |
|
| Raj, Suman | Indian Institute of Science, Bengaluru |
| Padhi, Swapnil | Indian Institute of Science, Bangalore |
| Bhoot, Ruchi | Indian Institute of Science, Bangalore |
| Modi, Prince | Indian Institute of Science, Bangalore |
| Simmhan, Yogesh | Indian Institute of Science |
Keywords: Collision Avoidance, Deep Learning for Visual Perception, Autonomous Vehicle Navigation
Abstract: Autonomous navigation by drones using onboard sensors combined with machine learning and computer vision algorithms benefits domains like agriculture, logistics and disaster management. Here, we examine the use of drones to assist Visually Impaired People (VIPs) in navigating through outdoor urban environments. Specifically, we introduce a perception-based path planning system to offer guidance to VIPs about their immediate vicinity as they walk through city streets. We represent the problem using a geometric formulation and propose a multi-DNN-based framework for local obstacle avoidance, both for the VIP and for the UAV that is following them. We present preliminary results from evaluations conducted on a drone-human system in a university campus environment.
|
| |
| 10:00-11:30, Paper MoAIP-20.23 | Add to My Program |
| Intention Aware Reinforcement Learning for Robot Crowd Navigation |
|
| Liu, Shuijing | University of Illinois at Urbana Champaign |
| Chang, Peixin | University of Illinois at Urbana Champaign |
| Huang, Zhe | University of Illinois at Urbana-Champaign |
| Chakraborty, Neeloy | University of Illinois at Urbana-Champaign |
| Hong, Kaiwen | University of Illinois at Urbana Champaign |
| McPherson, D. Livingston | University of Illinois |
| Geng, Junyi | Pennsylvania State University |
| Driggs-Campbell, Katherine | University of Illinois at Urbana-Champaign |
Keywords: Collision Avoidance, Human-Centered Robotics, Reinforcement Learning
Abstract: We study the problem of safe and intention-aware robot navigation in dense and interactive crowds. Most previous reinforcement learning (RL) based methods ignore the intentions of people, which results in performance degradation. To encourage longsighted robot behaviors, we infer the intentions of dynamic agents by predicting their future trajectories for several timesteps. The predictions are incorporated into a model-free RL framework to prevent the robot from intruding into the intended paths of other agents. We demonstrate that our method enables the robot to achieve good navigation performance and non-invasiveness in challenging crowd navigation scenarios. We successfully transfer the policy learned in simulation to a real-world TurtleBot 2i.
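One way to read "prevent the robot from intruding into the intended paths of other agents" is as a reward-shaping term over predicted trajectories. The sketch below is a hypothetical penalty of that flavour (names, radius, and weight are invented), not the authors' actual reward function.

```python
def intrusion_penalty(robot_pos, predicted_paths, radius=0.5, weight=1.0):
    """Negative reward for entering the predicted future footprint of any
    human: the closer the robot gets to a predicted waypoint (inside a
    safety radius), the larger the penalty."""
    penalty = 0.0
    for path in predicted_paths:
        for (px, py) in path:
            d = ((robot_pos[0] - px) ** 2 + (robot_pos[1] - py) ** 2) ** 0.5
            if d < radius:
                penalty += weight * (radius - d)
    return -penalty

# robot sitting on a pedestrian's predicted path gets a negative reward
paths = [[(0.0, 0.0), (0.2, 0.0), (0.4, 0.0)]]
r = intrusion_penalty((0.2, 0.0), paths)
print(r)
```

Added to a standard navigation reward, a term like this discourages the model-free policy from cutting through predicted paths while leaving behaviour elsewhere unchanged.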
|
| |
| 10:00-11:30, Paper MoAIP-20.24 | Add to My Program |
| Femtosecond Laser Fabricated Nitinol Living Hinges for Millimeter-Sized Robots |
|
| Hedrick, Alexander | University of Colorado Boulder |
| Kabutz, Heiko Dieter | University of Colorado Boulder |
| Smith, Lawrence | University of Colorado Boulder |
| Jayaram, Kaushik | University of Colorado Boulder |
Keywords: Compliant Joints and Mechanisms, Micro/Nano Robots
Abstract: Femtosecond laser technology can be used to process nitinol while avoiding heat affected zones (HAZ), thus retaining superelastic properties. In this work, we manufacture living hinges of arbitrary cross section from nitinol using a femtosecond laser micromachine. We determined the laser cutting parameters, modeled the hinges using an existing theoretical model as well as creating an Abaqus finite element model, and illustrated the accuracy of the models by comparing them to the torque produced by eight different hinges. Finally, we manufactured two prototype devices to illustrate the usefulness of these nitinol hinges: a sample spherical 5-bar mechanism and a piezo-electric actuated robotic wing.
|
| |
| 10:00-11:30, Paper MoAIP-20.25 | Add to My Program |
| Potential for Lighter Lower-Limb Exoskeletons through Parallel Springs |
|
| Kalicak, Jack | University of Notre Dame |
| Yang, Kang | University of Notre Dame |
| Bolívar-Nieto, Edgar | University of Notre Dame |
Keywords: Compliant Joints and Mechanisms, Optimization and Optimal Control, Prosthetics and Exoskeletons
Abstract: A chief concern for creating a comfortable exoskeleton is minimizing its mass. Heavier devices increase the metabolic energy required to use the exoskeleton. When powered by a motor and gearbox, these components represent a considerable portion of the system mass. Modern electric motor technology has the potential to power light exoskeletons. For example, an electric motor with specific power 682 [Watts/kg] could assist 100% of the RMS ankle power of 200 [Watts] with a few hundred grams. However, the peak and continuous motor torque are not enough for direct actuation, and the required transmission can increase the overall actuator mass two or three times --- resulting in a heavy exoskeleton. This work evaluates the potential to reduce mass by reducing RMS motor torque (related to continuous motor torque) for multiple activities of daily living using a parallel elastic actuator (PEA). Optimizing for multiple activities ensures the exoskeleton is beneficial in a variety of situations. Mass minimization occurs if the same torque can be generated and the parallel spring plus lighter motor weigh less than the original actuator. For very particular situations, our optimization reduced RMS torque enough to enable a smaller motor selection, especially at the ankle.
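The mass-reduction argument rests on a simple identity: with a parallel spring of stiffness k, the motor only supplies tau_total - k*theta, so choosing k to minimize RMS motor torque can enable a smaller motor. A minimal sketch with assumed sinusoidal gait profiles (toy numbers, not data from the paper):

```python
import math

def rms(seq):
    return math.sqrt(sum(v * v for v in seq) / len(seq))

def motor_torque_rms(total_torque, angle, k):
    """RMS motor torque when a parallel spring of stiffness k carries
    k*theta of the joint torque (PEA: tau_motor = tau_total - k*theta)."""
    return rms([t - k * th for t, th in zip(total_torque, angle)])

# toy gait-like joint angle [rad] and joint torque [Nm] over one cycle
theta = [0.1 * math.sin(2 * math.pi * i / 100) for i in range(100)]
tau = [30.0 * math.sin(2 * math.pi * i / 100) for i in range(100)]

baseline = motor_torque_rms(tau, theta, 0.0)     # no spring: motor does it all
# sweep stiffness [Nm/rad] and keep the value minimizing RMS motor torque
best_k = min(range(0, 401, 10), key=lambda k: motor_torque_rms(tau, theta, k))
print(baseline, motor_torque_rms(tau, theta, best_k), best_k)
```

In this contrived case torque is proportional to angle, so the spring can absorb nearly all of it; real multi-activity optimization trades off the spring stiffness across several such profiles at once.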
|
| |
| 10:00-11:30, Paper MoAIP-20.26 | Add to My Program |
| Automation of Post-Processing of Additive Manufacturing Using Machine Vision and Collaborative Robots |
|
| Schorr, Logan | Virginia Commonwealth University |
| Lee, Joseph | Virginia Commonwealth University |
| Hadimani, Ravi | Virginia Commonwealth University |
Keywords: Computer Vision for Automation, Additive Manufacturing, Vision-Based Navigation
Abstract: Automation via robotic systems is becoming widely adopted across many industries, but intelligent autonomy in dynamic environments is challenging to implement due to lack of information. These robots, if sufficiently equipped with 3D vision, are capable of integration into more powerful roles where automation is desired (e.g. hazardous conditions) but requires the flexibility of a human agent. Despite the clear benefits of 3D vision for robotics, object detection and location are remarkably difficult, especially when compared to their 2D counterparts. An accessible solution combining the power of 3D vision with the ease of 2D processing provides the ability to finally remove the human from a hazardous environment, as the robot is just as flexible. This paper proposes a method that utilizes 2D image processing to simplify 3D data for robotic workspace detection. Using a time-of-flight sensor, commonly used for autonomous vehicles and mobile robots, mounted on the end of a 6DOF robotic arm, a depth image of the workspace is collected. The algorithm identifies the contour of a table to create a mask, filters extraneous data points, and converts only the relevant sections to a 3D pointcloud. This 3D pointcloud is reoriented to the robot base frame and processed to identify the precise location of the workspace with regard to the robot. The robot then possesses the information required for operations concerning the workspace, without prior knowledge of its specific location. This method has been implemented with an accuracy within 0.75 cm, using only a single scan.
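The mask-then-backproject step can be sketched as a standard pinhole back-projection restricted to masked pixels. The 3x3 depth image and intrinsics below are made-up values for illustration, not the paper's sensor data.

```python
def depth_to_points(depth, mask, fx, fy, cx, cy):
    """Back-project only the masked depth pixels to 3D camera-frame points,
    mirroring the idea of using a 2D contour mask to keep the relevant part
    of the scene before building the pointcloud."""
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if mask[v][u] and d > 0:
                x = (u - cx) * d / fx   # pinhole model: pixel -> metric x
                y = (v - cy) * d / fy   # pinhole model: pixel -> metric y
                points.append((x, y, d))
    return points

# tiny 3x3 depth image (metres); the mask keeps only the centre "table" pixel
depth = [[0.0, 1.2, 0.0],
         [1.2, 1.0, 1.2],
         [0.0, 1.2, 0.0]]
mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]
pts = depth_to_points(depth, mask, fx=1.0, fy=1.0, cx=1.0, cy=1.0)
print(pts)   # one point on the optical axis at 1 m
```

Masking in the 2D image before back-projection is exactly what keeps the 3D processing cheap: the pointcloud only ever contains the pixels the contour mask admitted.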
|
| |
| 10:00-11:30, Paper MoAIP-20.27 | Add to My Program |
| Digital Twin Framework for Remote Maintenance of Nuclear Fusion Devices |
|
| Choi, Jungsup | SEOULTECH UNIVERSITY |
| Moon, Jeong Whan | KNR System |
| Ryew, Sung Moo | KnR Systems Inc |
| Lee, Dohee | Korea Institute of Fusion Energy |
| Kim, Hong-Tack | Korea Institute of Fusion Energy |
| Park, Young Min | Korea Institute of Fusion Energy |
| Hong, Kwon Hee | Korea Institute of Fusion Energy |
| Her, Namil | KFE |
| Kim, Beom Seok | Seoul National University of Science and Technology |
| Kim, Jinhyun | Seoul National University of Science and Technology |
Keywords: Computer Vision for Automation, Manipulation Planning, Redundant Robots
Abstract: We present a digital twin framework that leverages a hyper-redundant robot using MoveIt! and Unity. This framework aims to enhance the effectiveness of remote maintenance for nuclear fusion devices, highlighting notable progress in fusion device maintenance systems.
|
| |
| 10:00-11:30, Paper MoAIP-20.28 | Add to My Program |
| Temporal CFT: Multi-Temporal Cross-Modality Fusion Transformer for Multispectral Video Object Detection |
|
| Varaganti, Srikar | University of Michigan Ann Arbor |
| Kanu-Asiegbu, Asiegbu Miracle | University of Michigan |
| Du, Xiaoxiao | University of Michigan |
Keywords: Computer Vision for Automation, Multi-Modal Perception for HRI, Object Detection, Segmentation and Categorization
Abstract: Two challenges exist for the object detection task. First, RGB cameras can suffer from bad visibility at night and low-light conditions, while thermal infrared cameras are comparatively robust to illumination changes and shadow effects. Second, existing fusion methods for RGB and thermal data typically only work with image-level inputs without considering temporal relationships between video frame sequences. To address these challenges, we propose Temporal CFT, a multi-temporal cross-modality fusion transformer for multispectral (RGB and thermal) video object detection. We show that our proposed temporal CFT model achieves significantly higher true detection rate and improved person detection performance compared with the original CFT model with single-frame inputs.
|
| |
| 10:00-11:30, Paper MoAIP-20.29 | Add to My Program |
| Supporting Computer-Vision Tasks with Small Unmanned Aerial Systems through Autonomous Vision-Supported Maneuvers |
|
| Chowdhury, Muhammed Tawfiq | University of Notre Dame |
| Rashid, Md Tahmid | University of Notre Dame |
| Cleland-Huang, Jane | University of Notre Dame |
Keywords: Computer Vision for Automation, Vision-Based Navigation, Aerial Systems: Applications
Abstract: Given their agility and mobility, sensor-equipped small Unmanned Aerial Systems (sUAS) have evolved into pragmatic tools for critical applications such as search-and-rescue and damage assessment. Despite their virtues, various visual limitations, such as less-than-ideal orientations and perspectives, varying degrees of occlusion, atmospheric disturbances, and poor lighting conditions restrict the potency of sUAS in accurately identifying objects of interest through their cameras and onboard object recognition algorithms. In this poster, we present the concept of a motion planning system (MPS) that leverages positional knowledge and feed from cameras to maneuver a drone to an ideal position for better object recognition. We propose a real-time attribute analysis model for determining the performance of object recognition algorithms and use the models for formulating an ideal motion plan. The MPS is intended to make real-time decisions based on camera feedback and object recognition algorithm outputs.
|
| |
| 10:00-11:30, Paper MoAIP-20.30 | Add to My Program |
| Towards a Universal Calibration Framework for Mixed-Reality Assisted Robotic Surgery |
|
| Madani, Sepehr | McGill University |
| Sayadi, Amir | McGill Universiity |
| Turcotte, Robert | McGill University |
| Cecere, Renzo | McGill University |
| Aoude, Ahmed | McGill University |
| Hooshiar, Amir | McGill University |
Keywords: Computer Vision for Medical Robotics, Medical Robots and Systems, Surgical Robotics: Planning
Abstract: This study presents a method to improve the spatial localization accuracy of Mixed Reality (MR) systems in surgical applications. Fiducial tracking, although computationally efficient, lacks robust submillimetre accuracy. The study thus leverages the robustness of surgical navigation systems (NAV) to achieve accurate hologram anatomy registration in MR systems. The proposed method aims to localize the MR device's camera and the 3D hologram in a scenario suitable for use in surgical applications. The method was tested on four MR camera trajectories and showed superior results to an object rendered using a fiducial-based technique. Positional errors were as low as 0.2 mm in spatial trajectories with a standard error of 0.2 mm, indicating the robustness of the method. The proposed technique could pave the way for deploying MR-assisted surgeries with submillimetre accuracy and improved spatial localization. The method is also suited to off-the-shelf Head Mounted Displays (HMDs) like HoloLens 2 without native software development, suggesting its potential universal application.
|
| |
| 10:00-11:30, Paper MoAIP-20.31 | Add to My Program |
| Object Detection Using Multi-2D LiDAR for Drydock Block Operation |
|
| Kim, Myeongjin | Seoul National University of Science and Technology |
| Kim, Jinhyun | Seoul National University of Science and Technology |
Keywords: Computer Vision for Transportation
Abstract: Since the installation of drydock blocks has traditionally been a manual process using cranes, it has been associated with low work efficiency and a high risk of accidents. To address this, we propose automating the block installation step by estimating the location of drydock blocks on a robot platform using multiple 2D LiDAR sensors. We plan to utilize ROS (Robot Operating System) and incorporate data from each sensor node using the PCL (Point Cloud Library) and OpenCV libraries. In a CoppeliaSim simulation, the estimated center point of the block is located within 10 cm in each direction.
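A minimal version of the center-point estimate, taking the midpoint of the merged scans' bounding box, could look like the following. The scan geometry (a 1 m block at (2, 3) seen from two sensors) is invented for illustration and is far simpler than fused multi-LiDAR data.

```python
def block_center(points):
    """Estimate the block centre as the midpoint of the bounding box of the
    merged 2D LiDAR returns (a simplification of the multi-sensor fusion
    described in the abstract)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)

# points from two sensors viewing opposite faces of a block centred at (2, 3)
scan_a = [(1.5, y / 10) for y in range(25, 36)]   # left face at x = 1.5
scan_b = [(2.5, y / 10) for y in range(25, 36)]   # right face at x = 2.5
cx, cy = block_center(scan_a + scan_b)
print(cx, cy)
```

A real pipeline would first segment the block points out of the scans (e.g. with PCL clustering) before any such centre estimate is meaningful.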
|
| |
| 10:00-11:30, Paper MoAIP-20.32 | Add to My Program |
| Improving Explainable Object-Induced Model through Uncertainty for Automated Vehicles |
|
| Ling, Shihong | University of Pittsburgh |
| Wan, Yue | University of Pittsburgh |
| Jia, Xiaowei | University of Pittsburgh |
| Du, Na | University of Pittsburgh |
Keywords: Computer Vision for Transportation, Human-Centered Automation, AI-Based Methods
Abstract: To help improve system transparency in autonomous vehicles (AVs) and address "black box" challenges, our study presents a novel approach aimed at enhancing explanation generation in AVs. Our work leveraged the inherent uncertainty information in AV actions and proposed a new uncertainty-based reweighting strategy to improve the performance of the object-induced model and enhance interpretability in identifying challenging scenarios. Using the BDD-OIA dataset, we initially utilized the BDD-OIA deep learning architecture, developed separate network layers for each possible action, and connected the explanation outputs to their corresponding actions. To explicitly account for the model uncertainty and data uncertainty, our model utilized an evidential deep learning method that optimizes based on output distribution using a Dirichlet prior. Furthermore, we periodically adjusted the weights of the training data based on the calculated model and data uncertainty. Evaluation based on accuracy and Area Under the ROC Curve (AUC) scores clearly indicated that our architecture outperformed the original method, providing evidence that leveraging uncertainty can effectively enhance model performance. In future work, our focus will be on developing context and user-adaptive explanations through the implementation of a human-in-the-loop learning paradigm.
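With a Dirichlet output Dir(alpha), evidential deep learning gives expected class probabilities alpha/S and a vacuity-style uncertainty K/S, where S is the sum of the concentration parameters and K the number of classes. A sketch with toy numbers (not the paper's model or data):

```python
def dirichlet_uncertainty(alpha):
    """Expected class probabilities and vacuity-style uncertainty for a
    Dirichlet output, as in evidential deep learning: p = alpha/S, u = K/S."""
    S = sum(alpha)           # total evidence (Dirichlet strength)
    K = len(alpha)           # number of classes / actions
    probs = [a / S for a in alpha]
    return probs, K / S

# confident prediction: lots of evidence concentrated on one action
p1, u1 = dirichlet_uncertainty([20.0, 1.0, 1.0, 1.0])
# uncertain prediction: barely more evidence than the uniform prior
p2, u2 = dirichlet_uncertainty([1.1, 1.0, 1.0, 1.0])
print(u1, u2)
```

An uncertainty-based reweighting scheme like the one described could then, for example, upweight training samples whose u is high so the model revisits its hardest cases.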
|
| |
| 10:00-11:30, Paper MoAIP-20.33 | Add to My Program |
| Fuzzy Visual Obstacle Avoidance Using OpenCV and iRobot Create3 |
|
| Ferrer, Gabriel | Hendrix College |
Keywords: Vision-Based Navigation, Education Robotics, Neural and Fuzzy Control
Abstract: We are in the process of adopting iRobot Create3 robots for our undergraduate robotics course. This work specifically addresses teaching Computer Vision as a sensing technique. We have developed a fuzzy-logic-based reactive controller that uses OpenCV's contour detector to identify the boundary between the ground and objects resting upon it, assuming a flat ground area of uniform appearance. The robot drives towards the highest open space in the image. So far, the robot drives in very smooth paths, finding an area of maximum space. Unfortunately, when caught in a corner, it struggles to escape. Future work ideas include fuzzifying linear motion and integrating the IR and bump sensors to help the robot escape corners.
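A stripped-down version of such a reactive controller, steering toward the image region with the most free space via a weighted (defuzzified) vote, might look like the sketch below. The column-height inputs stand in for the ground/object boundary that the OpenCV contour detector would produce; the three-region split and gains are assumptions.

```python
def steer_from_free_space(column_heights, gains=(-1.0, 0.0, 1.0)):
    """Fuzzy-style reactive steering: the left, centre, and right thirds of
    the image vote with a weight equal to their free space (distance from
    the bottom of the frame up to the ground/object contour)."""
    n = len(column_heights)
    thirds = [column_heights[:n // 3],
              column_heights[n // 3:2 * n // 3],
              column_heights[2 * n // 3:]]
    weights = [sum(t) for t in thirds]
    total = sum(weights) or 1.0
    # weighted average (centroid defuzzification): steering command in [-1, 1]
    return sum(g * w for g, w in zip(gains, weights)) / total

# more open space in the right third of the image -> steer right (positive)
cmd = steer_from_free_space([10, 10, 10, 20, 20, 20, 80, 80, 80])
print(cmd)
```

Because the command is a smooth blend of all three regions rather than a hard argmax, the robot's paths stay smooth, which matches the behaviour reported in the abstract.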
|
| |
| MoPL Plenary session, Hall D |
Add to My Program |
| Plenary 1 - Marcia O'Malley |
|
| |
| Chair: Gregg, Robert D. | University of Michigan |
| |
| 11:30-12:30, Paper MoPL.1 | Add to My Program |
| Robots That Teach and Learn with a Human Touch |
|
| O'Malley, Marcia | Rice University |
Keywords: Haptics and Haptic Interfaces
Abstract: Marcia O'Malley is the Thomas Michael Panos Family Professor in Mechanical Engineering, Computer Science, and Electrical and Computer Engineering at Rice University, where she is currently serving as Chair of the Department of Mechanical Engineering. She received her BS in Mechanical Engineering from Purdue University, and her MS and PhD in Mechanical Engineering from Vanderbilt University. Her research is in the areas of haptics and robotic rehabilitation, with a focus on the design and control of wearable robotic devices for training and rehabilitation. O'Malley was a recipient of both the ONR Young Investigator award and the NSF CAREER Award. More recently, she has been recognized with Rice's Presidential Award for Mentoring, the Graduate Student Association Faculty Teaching and Mentoring Award, and the Rice University Faculty Award for Excellence in Research, Teaching, and Service. Her group has received Best Paper Awards in the IEEE Transactions on Haptics and the IEEE/ASME Transactions on Mechatronics. She is a Fellow of the American Society of Mechanical Engineers, the Institute of Electrical and Electronics Engineers, and the American Institute for Medical and Biological Engineering. She currently serves as Editor-in-Chief of the IEEE International Conference on Robotics and Automation Conference Editorial Board.
|
| |
| MoBT1 Regular session, 140A |
Add to My Program |
| Task Planning |
|
| |
| Chair: Vanderborght, Bram | Vrije Universiteit Brussel |
| Co-Chair: Alami, Rachid | CNRS |
| |
| 14:00-14:06, Paper MoBT1.1 | Add to My Program |
| Learning a Causal Transition Model for Object Cutting |
|
| Zhang, Zeyu | Beijing Institute for General Artificial Intelligence |
| Han, Muzhi | University of California, Los Angeles |
| Jia, Baoxiong | Beijing Institute for General Artificial Intelligence |
| Jiao, Ziyuan | Beijing Institute for General Artificial Intelligence |
| Zhu, Yixin | Peking University |
| Zhu, Song-Chun | UCLA |
| Liu, Hangxin | Beijing Institute for General Artificial Intelligence (BIGAI) |
Keywords: Task Planning, Simulation and Animation
Abstract: Cutting objects into desired fragments is challenging for robots due to the spatially unstructured nature of fragments and the complex one-to-many object fragmentation caused by actions. We present a novel approach to model object fragmentation using an attributed stochastic grammar. This grammar abstracts fragment states as node variables and captures causal transitions in object fragmentation through production rules. We devise a probabilistic framework to learn this grammar from human demonstrations. The planning process for object cutting involves inferring an optimal parse tree of desired fragments using the learned grammar, with parse tree productions corresponding to cutting actions. We employ Monte Carlo Tree Search (MCTS) to efficiently approximate the optimal parse tree and generate a sequence of executable cutting actions. The experiments demonstrate the efficacy in planning for object-cutting tasks, both in simulation and on a physical robot. The proposed approach outperforms several baselines by demonstrating superior generalization to novel setups, thanks to the compositionality of the grammar model.
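The MCTS approximation mentioned above typically selects among children with UCB1, balancing the average reward of a cutting action against how rarely it has been tried. A generic selection sketch (not the authors' planner; the exploration constant is an assumption):

```python
import math

def ucb_select(children, c=1.4):
    """UCB1 child selection, the core of MCTS tree descent.
    children: list of (visit_count, total_reward) per candidate action."""
    total_visits = sum(v for v, _ in children)
    def score(child):
        v, r = child
        if v == 0:
            return float("inf")   # always try unvisited children first
        # exploitation (mean reward) + exploration bonus
        return r / v + c * math.sqrt(math.log(total_visits) / v)
    return max(range(len(children)), key=lambda i: score(children[i]))

# the unvisited child wins regardless of the others' averages
print(ucb_select([(10, 7.0), (0, 0.0), (5, 4.5)]))
```

In the paper's setting the children would be grammar productions (cutting actions), and the rollout reward would score how close the resulting fragments are to the desired parse.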
|
| |
| 14:06-14:12, Paper MoBT1.2 | Add to My Program |
| Object Rearrangement Planning for Target Retrieval in a Confined Space with Lateral View |
|
| Kang, Minjae | Seoul National University (SNU) |
| Kim, Junseok | Seoul National University |
| Kee, Hogun | Seoul National University |
| Oh, Songhwai | Seoul National University |
Keywords: Task and Motion Planning, Manipulation Planning, Deep Learning in Grasping and Manipulation
Abstract: In this paper, we perform an object rearrangement task for target retrieval in an environment with a confined space and limited observation directions. The agent must create a collision-free path to bring out the target object by relocating the surrounding objects using the prehensile action, i.e., pick-and-place. Object rearrangement in a confined space is a non-monotone problem, and finding a valid plan within a reasonable time is challenging. We propose a novel algorithm that divides the target retrieval task, which requires a long sequence of actions, into sequential sub-problems and explores each solution through subgoal-conditioned Monte Carlo tree search (MCTS). In the experiment, we verify that the proposed algorithm can find safe rearrangement plans with various objects efficiently compared to the existing planning methods. Furthermore, we show that the proposed method can be transferred to a real robot experiment without additional training.
|
| |
| 14:12-14:18, Paper MoBT1.3 | Add to My Program |
| Learning Type-Generalized Actions for Symbolic Planning |
|
| Tanneberg, Daniel | Honda Research Institute |
| Gienger, Michael | Honda Research Institute Europe |
Keywords: Representation Learning, Task Planning
Abstract: Symbolic planning is a powerful technique to solve complex tasks that require long sequences of actions and can equip an intelligent agent with complex behavior. The downside of this approach is the necessity for suitable symbolic representations describing the state of the environment as well as the actions that can change it. Traditionally such representations are carefully hand-designed by experts for distinct problem domains, which limits their transferability to different problems and environment complexities. In this paper, we propose a novel concept to generalize symbolic actions using a given entity hierarchy and observed similar behavior. In a simulated grid-based kitchen environment, we show that type-generalized actions can be learned from few observations and generalize to novel situations. Incorporating an additional on-the-fly generalization mechanism during planning, unseen task combinations, involving longer sequences, novel entities and unexpected environment behavior, can be solved.
|
| |
| 14:18-14:24, Paper MoBT1.4 | Add to My Program |
| CAR-DESPOT: Causally-Informed Online POMDP Planning for Robots in Confounded Environments |
|
| Cannizzaro, Ricardo | Oxford Robotics Institute |
| Kunze, Lars | University of Oxford |
Keywords: Task Planning, Probabilistic Inference
Abstract: Robots operating in real-world environments must reason about possible outcomes of stochastic actions and make decisions based on partial observations of the true world state. A major challenge for making accurate and robust action predictions is the problem of confounding, which if left untreated can lead to prediction errors. The partially observable Markov decision process (POMDP) is a widely-used framework to model these stochastic and partially-observable decision-making problems. However, due to a lack of explicit causal semantics, POMDP planning methods are prone to confounding bias and thus in the presence of unobserved confounders may produce underperforming policies. This paper presents a novel causally-informed extension of "anytime regularized determinized sparse partially observable tree" (AR-DESPOT), a modern anytime online POMDP planner, using causal modelling and inference to eliminate errors caused by unmeasured confounder variables. We further propose a method to learn offline the partial parameterisation of the causal model for planning, from ground truth model data. We evaluate our methods on a toy problem with an unobserved confounder and show that the learned causal model is highly accurate, while our planning method is more robust to confounding and produces overall higher performing policies than AR-DESPOT.
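POMDP planners such as (AR-)DESPOT reason over beliefs, and the underlying exact Bayes belief update for a discrete model is short enough to state directly: predict through the transition model, then weight by the observation likelihood. The two-state transition and observation matrices below are toy numbers, not from the paper.

```python
def belief_update(belief, trans_model, obs_model, obs):
    """Exact Bayes belief update for a discrete POMDP.
    trans_model[s][s2] = P(s2 | s, a) for the chosen action a;
    obs_model[s2][o]   = P(o | s2)."""
    n = len(belief)
    # prediction step: push the belief through the transition model
    predicted = [sum(belief[s] * trans_model[s][s2] for s in range(n))
                 for s2 in range(n)]
    # correction step: weight by the observation likelihood, then normalize
    unnorm = [obs_model[s2][obs] * predicted[s2] for s2 in range(n)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# two hidden states, two observations
T = [[0.9, 0.1], [0.2, 0.8]]          # P(s' | s) under the chosen action
O = [[0.8, 0.2], [0.3, 0.7]]          # P(o | s')
b = belief_update([0.5, 0.5], T, O, obs=0)
print(b)
```

An unobserved confounder, in the paper's terms, would make T and O themselves depend on a hidden variable, which is exactly what biases a planner that assumes the models above are fixed.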
|
| |
| 14:24-14:30, Paper MoBT1.5 | Add to My Program |
| Recurrent Macro Actions Generator for POMDP Planning |
|
| Liang, Yuanchu | The Australian National University |
| Kurniawati, Hanna | Australian National University |
Keywords: Task and Motion Planning, AI-Based Methods, Probability and Statistical Methods
Abstract: Many planning problems in robotics have long planning horizons and are uncertain in nature. The Partially Observable Markov Decision Process (POMDP) is a mathematically principled framework for planning under uncertainty. To alleviate the difficulties of computing good approximate POMDP solutions for long-horizon problems, one often plans using macro actions, where each macro action is a chain of primitive actions. Such a strategy reduces the effective planning horizon of the problem, and hence the computational complexity of solving it. The difficulty is in generating a set of suitable macro actions. In this paper, we present a simple recurrent neural network that learns to generate suitable sets of candidate macro actions that exploit environment information. Key to this learning method is to represent the raw partial information from the environment as a latent problem instance, and to sequentially generate macro actions conditioned on the past information. We compare our proposed method with the state of the art [1] on four different long-horizon planning tasks of various difficulties. The results indicate that the quality of the policies computed using macro actions generated by our proposed method consistently exceeds the benchmarks. Our implementation can be accessed at https://github.com/YC-Liang/Recurrent-Macro-Action-Generator.
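The horizon-reduction argument is easy to quantify: macros of length L shrink an exhaustive lookahead's depth from H to ceil(H/L). A back-of-the-envelope sketch with assumed branching factors (illustrative only, unrelated to the paper's benchmarks):

```python
import math

def search_nodes(branching, horizon):
    """Worst-case node count for an exhaustive lookahead to a given depth."""
    return sum(branching ** d for d in range(1, horizon + 1))

def macro_search_nodes(n_macros, macro_len, horizon):
    """Same lookahead when each action is a macro of macro_len primitives:
    the effective planning horizon shrinks by the macro length."""
    eff_horizon = math.ceil(horizon / macro_len)
    return sum(n_macros ** d for d in range(1, eff_horizon + 1))

# 4 primitive actions to horizon 12, vs. 8 candidate macros of length 4
print(search_nodes(4, 12), macro_search_nodes(8, 4, 12))
```

Even with twice as many macro candidates as primitive actions, the tree is orders of magnitude smaller, which is why generating a good macro set (the hard part the paper addresses) pays off.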
|
| |
| 14:30-14:36, Paper MoBT1.6 | Add to My Program |
| Task Planning and Motion Control with Temporal Logic Specifications |
|
| Pereira, Marcos S. | Universidade Federal De Minas Gerais |
| Pimenta, Luciano | Universidade Federal De Minas Gerais |
| Adorno, Bruno Vilhena | The University of Manchester |
Keywords: Task and Motion Planning, Formal Methods in Robotics and Automation, Motion Control
Abstract: This paper proposes a task planning and motion control framework that generates task plans for a linear temporal logic (LTL) specification, which are then executed using a task-space constrained motion controller and a local task planner that overcomes local minima. We propose a new encoding for task specifications, directly in the task-space, as constraints of a mixed-integer linear program that can be used with off-the-shelf LTL linear encodings. We apply our framework to plan and execute trajectories for a free-flying robot and show that the task plan is accomplished without collisions, even in the presence of unexpected moving obstacles that are not considered in the planning phase, while control signal constraints are satisfied. To evaluate the local minima avoidance, we compare the local task planner with a sampling-based motion planner, and the results show a smoother trajectory with faster execution and less total planning time when using our framework. Lastly, our framework scaled well with a longer LTL specification, as opposed to automata-based frameworks that usually suffer from the curse of dimensionality.
|
| |
| 14:36-14:42, Paper MoBT1.7 | Add to My Program |
| Simultaneous Action and Grasp Feasibility Prediction for Task and Motion Planning through Multi-Task Learning |
|
| Ait Bouhsain, Smail | LAAS-CNRS |
| Alami, Rachid | CNRS |
| Simeon, Thierry | LAAS-CNRS |
Keywords: Task and Motion Planning, Deep Learning in Grasping and Manipulation, Manipulation Planning
Abstract: In this paper, we address task and motion planning (TAMP) which is an important yet challenging robotics problem. It is known to suffer from the high combinatorial complexity of discrete search, often requiring a large number of geometric planning calls. We build upon recent works in TAMP by taking advantage of learning methods to provide action feasibility information as a heuristic to the symbolic planner, thus guiding it to a geometrically feasible solution and reducing geometric planning time. We propose AGFP-Net, a multi-task neural network predicting not only action feasibility, but also the feasibility of a set of grasp types. We also propose an improved feasibility-informed TAMP algorithm capable of solving more complex problems, and handling goals which are not fully specified. Comparative results obtained on different problems of varying complexity show that our method is able to greatly reduce task and motion planning time.
|
| |
| 14:42-14:48, Paper MoBT1.8 | Add to My Program |
| Differentiable Task Assignment and Motion Planning |
|
| Envall, Jimmy | ETH Zurich |
| Poranne, Roi | University of Haifa |
| Coros, Stelian | ETH Zurich |
Keywords: Task and Motion Planning, Cooperating Robots, Manipulation Planning
Abstract: Task and motion planning is one of the key problems in robotics today. It is often formulated as a discrete task allocation problem combined with continuous motion planning. Many existing approaches to TAMP involve explicit descriptions of task primitives that cause discrete changes in the kinematic relationship between the actor and the objects. In this work we propose an alternative, fully differentiable approach which supports a large number of TAMP problem instances. Rather than explicitly enumerating task primitives, actions are instead represented implicitly as part of the solution to a nonlinear optimization problem. We focus on decision making for robotic manipulators, specifically for pick and place tasks, and explore the efficacy of the model through a number of simulated experiments including multiple robots, objects and interactions with the environment. We also show several possible extensions.
|
| |
| 14:48-14:54, Paper MoBT1.9 | Add to My Program |
| Effectively Rearranging Heterogeneous Objects on Cluttered Tabletops |
|
| Gao, Kai | Rutgers University |
| Yu, Justin | Rutgers University |
| Punjabi, Tanay Sandeep | Rutgers |
| Yu, Jingjin | Rutgers University |
Keywords: Task Planning, Manipulation Planning, Logistics
Abstract: Effectively rearranging heterogeneous objects constitutes a high-utility skill that an intelligent robot should master. Whereas significant work has been devoted to the grasp synthesis of heterogeneous objects, little attention has been given to the planning for sequentially manipulating such objects. In this work, we examine the long-horizon sequential rearrangement of heterogeneous objects in a tabletop setting, addressing not just generating feasible plans but near-optimal ones. Toward that end, and building on previous methods, including combinatorial algorithms and Monte Carlo tree search-based solutions, we develop state-of-the-art solvers for optimizing two practical objective functions considering key object properties such as size and weight. Thorough simulation studies show that our methods provide significant advantages in handling challenging heterogeneous object rearrangement problems, especially in cluttered settings. Real robot experiments further demonstrate and confirm these advantages. Source code and evaluation data associated with this research will be available at https://github.com/arc-l/TRLB upon the publication of this manuscript.
|
| |
| 14:54-15:00, Paper MoBT1.10 | Add to My Program |
| Semantics-Aware Mission Adaptation for Autonomous Exploration in Urban Environments |
|
| Moon, Sangwoo | Jet Propulsion Laboratory, NASA |
| Peltzer, Oriana | Stanford University |
| Ott, Joshua | Stanford University |
| Kim, Sung-Kyun | NASA Jet Propulsion Laboratory, Caltech |
| Agha-mohammadi, Ali-akbar | NASA-JPL, Caltech |
Keywords: Task Planning, Task and Motion Planning, Planning, Scheduling and Coordination
Abstract: Robust mission planning is an essential component for mission autonomy to perform complicated tasks in extreme environments. In this paper, we are interested in the role of semantic abstractions for guiding autonomous mission planning. In particular, we focus on how semantics can be leveraged to transition, at the mission level, in between individually robust task plans. We present a mission autonomy framework wherein a task plan adaptation policy leverages up-to-date semantics information in order to adapt to changes that occur during run-time, which endows the robot with better resiliency to unexpected events and improves the overall efficiency of mission operations. Under this new perspective, we provide a concrete and challenging application of autonomous exploration and radio source seeking in a complex multi-level building environment. Experimental results over simulations and real hardware tests demonstrate that the presented semantics-aware mission adaptation more effectively completes the mission with better qualitative results compared to a non-adaptive baseline.
|
| |
| 15:00-15:06, Paper MoBT1.11 | Add to My Program |
| Optimal Cost-Preference Trade-Off Planning with Multiple Temporal Tasks |
|
| Amorese, Peter | University of Colorado Boulder |
| Lahijanian, Morteza | University of Colorado Boulder |
Keywords: Task Planning, Task and Motion Planning, Motion and Path Planning
Abstract: Autonomous robots are increasingly utilized in realistic scenarios with multiple complex tasks. In these scenarios, there may be a preferred way of completing all of the given tasks, but it is often in conflict with optimal execution. Recent works study preference-based planning; however, they have yet to extend the notion of preference to the behavior of the robot with respect to each task. In this work, we introduce a novel notion of preference that provides a generalized framework to express preferences over individual tasks as well as their relations. Then, we perform an optimal trade-off (Pareto) analysis between behaviors that adhere to the user's preference and those that are resource optimal. We introduce an efficient planning framework that generates Pareto-optimal plans given a user's preference by extending A* search. Further, we show a method of computing the entire Pareto front (the set of all optimal trade-offs) via an adaptation of a multi-objective A* algorithm. We also present a problem-agnostic search heuristic to enable scalability. We illustrate the power of the framework on both mobile robots and manipulators. Our benchmarks show the effectiveness of the heuristic with up to two orders of magnitude speedup.
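The core object in a trade-off analysis like the one above is the set of non-dominated plans. As a minimal illustrative sketch (not the paper's algorithm; the objective vectors and function names are hypothetical), Pareto dominance and front filtering over (cost, preference-deviation) pairs can be written as:

```python
def dominates(a, b):
    """True if vector a Pareto-dominates b: no worse in every
    objective and strictly better in at least one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Each tuple is (resource cost, preference deviation); both are minimized.
plans = [(3, 5), (4, 4), (5, 3), (4, 6), (6, 6)]
front = pareto_front(plans)  # → [(3, 5), (4, 4), (5, 3)]
```

A multi-objective A* keeps such a non-dominated set per search node instead of a single scalar cost, which is what makes computing the whole front possible in one search.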
|
| |
| 15:06-15:12, Paper MoBT1.12 | Add to My Program |
| Optimal and Stable Multi-Layer Object Rearrangement on a Tabletop |
|
| Xu, Andy | Rutgers University |
| Gao, Kai | Rutgers University |
| Feng, Si Wei | Rutgers University |
| Yu, Jingjin | Rutgers University |
Keywords: Task Planning, Assembly, Manipulation Planning
Abstract: Object rearrangement is a fundamental sub-task in accomplishing a great many physical tasks. As such, effectively executing rearrangement is an important skill for intelligent robots to master. In this study, we conduct the first algorithmic study on optimally solving the problem of Multi-layer Object Rearrangement on a Tabletop (MORT), in which one object may be relocated at a time, and an object can only be moved if other objects do not block its top surface. In addition, any intermediate structure during the reconfiguration process must be physically stable, i.e., it should stand without external support. To tackle the dual challenges of untangling the dependencies between objects and ensuring structural stability, we develop an algorithm that interleaves the computation of the optimal rearrangement plan and structural stability checking. Using a carefully constructed integer linear programming (ILP) model, our algorithm, Stability-aware Integer Programming-based Planner (SIPP), readily scales to optimally solve complex rearrangement problems of 3D structures with over 60 building blocks, with solution quality significantly outperforming natural greedy best-first approaches. Upon the publication of the manuscript, source code and data will be available at https://github.com/arc-l/mort/
|
| |
| 15:12-15:18, Paper MoBT1.13 | Add to My Program |
| Task and Motion Planning with Large Language Models for Object Rearrangement |
|
| Ding, Yan | SUNY Binghamton |
| Zhang, Xiaohan | SUNY Binghamton |
| Paxton, Chris | Meta AI |
| Zhang, Shiqi | SUNY Binghamton |
Keywords: Task and Motion Planning, Service Robotics
Abstract: Multi-object rearrangement is a crucial skill for service robots, and commonsense reasoning is frequently needed in this process. However, achieving commonsense arrangements requires knowledge about objects, which is hard to transfer to robots. Large language models (LLMs) are one potential source of this knowledge, but they do not naively capture information about plausible physical arrangements of the world. We propose LLM-GROP, which uses prompting to extract commonsense knowledge about functional, semantically valid object configurations from an LLM, and instantiates them with a task and motion planner in order to generalize to varying scene geometry. LLM-GROP allows us to go from natural-language commands to human-aligned object rearrangement in varied environments. Based on human evaluations, our approach achieved the highest rating and outperformed competitive baselines in terms of success rate while maintaining comparable cumulative action costs. Finally, we demonstrate a practical implementation of LLM-GROP on a mobile manipulator in real-world scenarios.
|
| |
| 15:18-15:24, Paper MoBT1.14 | Add to My Program |
| Synergistic Task and Motion Planning with Reinforcement Learning-Based Non-Prehensile Actions |
|
| Liu, Gaoyuan | Vrije Universiteit Brussel |
| De Winter, Joris | Vrije Universiteit Brussel |
| Steckelmacher, Denis | Vrije Universiteit Brussel |
| Hota, Roshan Kumar | Indian Institute of Technology, Kharagpur, India |
| Nowé, Ann | VUB |
| Vanderborght, Bram | Vrije Universiteit Brussel |
Keywords: Task and Motion Planning, Reinforcement Learning, Manipulation Planning
Abstract: Robotic manipulation in cluttered environments requires synergistic planning among prehensile and non-prehensile actions. Previous works on sampling-based Task and Motion Planning (TAMP) algorithms, e.g. PDDLStream, provide a fast and generalizable solution for multi-modal manipulation. However, they are likely to fail in cluttered scenarios where no collision-free grasping approaches can be sampled without preliminary manipulations. To extend the ability of sampling-based algorithms, we integrate a vision-based Reinforcement Learning (RL) non-prehensile procedure, pusher. The pushing actions generated by pusher can eliminate interlocked situations and make the grasping problem solvable. Also, the sampling-based algorithm evaluates the pushing actions by providing rewards in the training process, so the pusher can learn to avoid situations leading to irreversible failures. The proposed hybrid planning method is validated on a cluttered bin-picking problem and implemented in both simulation and the real world. Results show that the pusher can effectively improve the success ratio of the previous sampling-based algorithm, while the sampling-based algorithm can help the pusher learn pushing skills.
|
| |
| MoBT2 Regular session, 140B |
Add to My Program |
| Prosthesis Design and Control |
|
| |
| Chair: Lenzi, Tommaso | University of Utah |
| Co-Chair: Artemiadis, Panagiotis | University of Delaware |
| |
| 14:00-14:06, Paper MoBT2.1 | Add to My Program |
| Improving Amputee Endurance Over Activities of Daily Living with a Robotic Knee-Ankle Prosthesis: A Case Study |
|
| Best, T. Kevin | University of Michigan |
| Laubscher, Curt A. | University of Michigan |
| Cortino, Ross | University of Michigan |
| Cheng, Shihao | University of Michigan, Ann Arbor |
| Gregg, Robert D. | University of Michigan |
Keywords: Prosthetics and Exoskeletons, Rehabilitation Robotics
Abstract: Robotic knee-ankle prostheses have often fallen short relative to passive microprocessor prostheses in time-based clinical outcome tests. User ambulation endurance is an alternative clinical outcome metric that may better highlight the benefits of robotic prostheses. However, previous studies were unable to show endurance benefits due to inaccurate high-level classification, discretized mid-level control, and insufficiently difficult ambulation tasks. In this case study, we present a phase-based mid-level prosthesis controller which yields biomimetic joint kinematics and kinetics that adjust to suit a continuum of tasks. We enrolled an individual with an above-knee amputation and challenged him to perform repeated, rapid laps of a circuit comprising activities of daily living with both his passive prosthesis and a robotic prosthesis. The participant demonstrated improved endurance with the robotic prosthesis and our mid-level controller compared to his passive prosthesis, completing over twice as many total laps before fatigue and muscle discomfort required him to stop. We also show that time-based outcome metrics fail to capture this endurance improvement, suggesting that alternative metrics related to endurance and fatigue may better highlight the clinical benefits of robotic prostheses.
|
| |
| 14:06-14:12, Paper MoBT2.2 | Add to My Program |
| Controlling Powered Prosthesis Kinematics Over Continuous Transitions between Walk and Stair Ascent |
|
| Cheng, Shihao | University of Michigan, Ann Arbor |
| Laubscher, Curt A. | University of Michigan |
| Gregg, Robert D. | University of Michigan |
Keywords: Prosthetics and Exoskeletons, Wearable Robotics, Motion Control
Abstract: One of the primary benefits of emerging powered prosthetic legs is their ability to facilitate step-over-step stair ascent by providing positive mechanical work. Existing control methods typically have distinct steady-state activity modes for walking and stair ascent, where activity transitions involve discretely switching between controllers and often must be initiated with a particular leg. However, these discrete transitions do not necessarily replicate able-bodied joint biomechanics, which have been shown to continuously adjust over a transition stride. This paper presents a phase-based kinematic controller for a powered knee-ankle prosthesis that enables continuous, biomimetic transitions between walking and stair ascent. The controller tracks joint angles from a data-driven kinematic model that continuously interpolates between the steady-state kinematic models, and it allows both the prosthetic and intact leg to lead the transitions. Results from experiments with two transfemoral amputee participants indicate that knee and ankle kinematics smoothly transition between walking and stair ascent, with comparable or lower root mean square errors compared to variations from able-bodied data.
|
| |
| 14:12-14:18, Paper MoBT2.3 | Add to My Program |
| Calibration of a Tibia-Based Phase Variable for Control of Robotic Transtibial Prostheses |
|
| Posh, Ryan | University of Notre Dame |
| Tittle, Jonathan Allen | University of Notre Dame |
| Schmiedeler, James | University of Notre Dame |
| Wensing, Patrick M. | University of Notre Dame |
Keywords: Prosthetics and Exoskeletons
Abstract: Phase variable control based on global tibia kinematics holds promise for predicting gait cycle progression to continuously control robotic transtibial prostheses. Calibration of the phase variable is critical to ensure its monotonic behavior, to approach a linear relationship with gait percentage, and to accurately predict the percentage of gait. This paper compares four calibration approaches using data from 22 able-bodied subjects walking at 14 speeds [1]. The typical pure centering (PC) approach employed for thigh-based phase variables is not viable, yielding monotonic phase progression in fewer than half of the cases. An optimization (OPT) approach found monotonic calibrations in 305/308 cases with high linearity (average R2 of 0.91). Critical point centering (CPC) approximates the OPT performance, with 274/308 monotonic calibrations and an average R2 of 0.85, whereas the related vertical weighted average (VWA) approach was only slightly better than PC. All four approaches are similarly accurate in predicting gait percentage, staying within 5% at least 92.7% of the time.
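The two calibration criteria the abstract evaluates, monotonic phase progression and linearity (R²) against gait percentage, can be checked with a small sketch. This is illustrative only and not the paper's calibration procedure; the function name and the synthetic phase signal are assumptions:

```python
import numpy as np

def phase_quality(phase, gait_pct):
    """Assess a candidate phase variable over one stride.

    phase    : phase variable sampled over the stride
    gait_pct : ground-truth gait percentage (0..100) at the same samples
    Returns (is_monotonic, R^2 of the best linear fit phase ~ gait_pct).
    """
    phase = np.asarray(phase, dtype=float)
    monotonic = np.all(np.diff(phase) > 0)
    a, b = np.polyfit(gait_pct, phase, 1)          # least-squares line
    resid = phase - (a * np.asarray(gait_pct) + b)
    r2 = 1.0 - resid.var() / phase.var()
    return bool(monotonic), float(r2)

gait = np.linspace(0, 100, 101)
# A slightly nonlinear but monotonic synthetic phase estimate
phase = gait / 100 + 0.02 * np.sin(gait / 100 * np.pi)
mono, r2 = phase_quality(phase, gait)
```

A calibration like OPT would tune the phase-variable parameters so that both checks pass across all speeds, rather than just scoring one stride as here.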
|
| |
| 14:18-14:24, Paper MoBT2.4 | Add to My Program |
| On Intuitive Control of Ankle-Foot Prostheses: A Sensor Fusion-Based Algorithm for Real-Time Prediction of Transitions to Compliant Surfaces |
|
| Angelidou, Charikleia | University of Delaware |
| Artemiadis, Panagiotis | University of Delaware |
Keywords: Prosthetics and Exoskeletons
Abstract: Substantial research and development on the design and control of robotic ankle-foot prostheses have aimed to restore normal function and movement capacity for people with gait impairments and lower limb amputations. However, prosthesis controllers usually fail to incorporate information about the properties of the walking terrain, such as ground stiffness. There is therefore a need for a framework that adjusts the prosthesis parameters according to the user's intent to transition to a variable-impedance terrain. To achieve this, we need to incorporate the human wearer in the control loop of the prosthesis. This work proposes an advanced, high-level controller framework for powered ankle-foot prostheses that combines subject-specific pattern recognition (PR) and classification strategies to predict whether the next step will be on a rigid or compliant surface. Comparing the Support Vector Machine (SVM) and k-Nearest Neighbors (k-NN) classification algorithms for this task, we conclude that by combining a k-NN implementation with a Pattern Recognition Neural Network (PR NN), our method can accurately forecast upcoming surface stiffness transitions in time to allow for prompt adaptation to the new walking terrain. We also show that the sensor fusion of kinematic and surface electromyographic (EMG) data outperforms single-source inputs, producing the best prediction results for all subjects with an accuracy of up to 87.5%.
|
| |
| 14:24-14:30, Paper MoBT2.5 | Add to My Program |
| Powered Knee and Ankle Prosthesis Control for Adaptive Ambulation at Variable Speeds, Inclines, and Uneven Terrains |
|
| Sullivan, Liam | University of Utah |
| Creveling, Suzi | University of Utah |
| Cowan, Marissa | University of Utah |
| Gabert, Lukas | University of Utah |
| Lenzi, Tommaso | University of Utah |
Keywords: Prosthetics and Exoskeletons, Rehabilitation Robotics, Wearable Robotics
Abstract: Ambulation in everyday life requires walking at variable speeds, variable inclines, and variable terrains. Powered prostheses aim to provide this adaptability through control of the actuated joints. Some powered prosthesis controllers can adapt to discrete changes in speed and incline but require manual tuning to determine the control parameters, leading to poor clinical viability. Other data-driven controllers can continuously adapt to changes in speed and incline but do so by imposing the same non-amputee gait patterns for all amputee subjects, which does not consider subjective preferences and differing clinical needs of users. Here, we present a controller for powered knee and ankle prostheses that can continuously adapt to different walking speeds, inclines, and uneven terrains without enforcing a specific prosthesis position, impedance, or torque. A virtual biarticular muscle connection determines the knee flexion torque, which changes with both speed and slope. Adaptation to inclines and uneven terrains is based solely on the global shank orientation. Continuously variable damping allows for speed adaptation. Minimum-jerk programming defines the prosthesis swing trajectory at variable cadences. Experiments with one individual with an above-knee amputation suggest that the proposed controller can effectively adapt to different walking speeds, inclines, and rough terrains.
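Minimum-jerk trajectories, used above to define the swing phase at variable cadences, follow a standard closed form: a quintic in normalized time with zero velocity and acceleration at both endpoints. A minimal sketch (the parameter values are illustrative, not from the paper):

```python
def min_jerk(t, T, x0, xf):
    """Minimum-jerk position at time t for a move from x0 to xf in T seconds:
    x(t) = x0 + (xf - x0) * (10 s^3 - 15 s^4 + 6 s^5),  s = t / T.
    Velocity and acceleration are zero at t = 0 and t = T."""
    s = t / T
    return x0 + (xf - x0) * (10 * s**3 - 15 * s**4 + 6 * s**5)

# Example: swing the knee from 0 to 60 degrees; shrinking T adapts the
# same shape to a faster cadence.
knee_mid = min_jerk(0.2, 0.4, 0.0, 60.0)  # halfway through a 0.4 s swing
```

Because the profile depends on t only through s = t/T, rescaling T changes the cadence without altering the trajectory shape, which is what makes this parameterization convenient for speed adaptation.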
|
| |
| 14:30-14:36, Paper MoBT2.6 | Add to My Program |
| Motor Unit Action Potential Based Classification of Hand and Arm Motions |
|
| Twardowski, Michael | Delsys & Altec Inc |
| Chan, Michael | Delsys Inc |
| Li, Zhi | Worcester Polytechnic Institute |
| De Luca, Gianluca | Delsys Inc |
| Kline, Joshua | Delsys & Altec Inc |
| Chiodini, John | Delsys Inc |
Keywords: Brain-Machine Interfaces, Motion Control, Prosthetics and Exoskeletons
Abstract: While motion classification architectures have improved in accuracy and robustness in recent years, computationally expensive approaches and sophisticated hardware dependencies limit their real-world applicability. To overcome these challenges, we have designed a lightweight, real-time architecture for classifying motions of the arm and hand using features derived from motor unit action potentials within surface electromyographic (sEMG) signals, which provide direct interrogation of underlying muscle activation patterns. We tested the architecture on 6 motions performed dynamically across a range of muscle contraction intensities, achieving median classification accuracies ranging from 91.3% to 93.3% and an average processing time of approximately 40 ms across three different classifiers. Taken together, our findings demonstrate the potential robustness of motor unit based neural interfaces for motion classification tasks.
|
| |
| 14:36-14:42, Paper MoBT2.7 | Add to My Program |
| Adjusting the Quasi-Stiffness of an Ankle-Foot Prosthesis Improves Walking Stability During Locomotion Over Compliant Terrain |
|
| Karakasis, Chrysostomos | University of Delaware, Mechanical Engineering Department |
| Salati, Robert | University of Delaware |
| Artemiadis, Panagiotis | University of Delaware |
Keywords: Prosthetics and Exoskeletons
Abstract: Despite significant advances in the design of robotic lower-limb prostheses for individuals with impaired mobility, there is a need for further progress in improving the robustness, safety, and stability of these devices in a wide range of activities of daily living. Although powered prostheses have been able to adapt to different speeds, conditions, and rigid terrains, no control strategies have been proposed for addressing walking over compliant surfaces. This work proposes a continuous admittance controller that adjusts the ankle quasi-stiffness of a powered ankle-foot prosthesis and improves gait stability during locomotion over compliant terrain. The proposed controller is evaluated with walking experiments on an instrumented treadmill that can accurately change the walking surface stiffness. In these experiments, the proposed controller accurately changes the prosthesis ankle quasi-stiffness across a wide range of 10-20 Nm/deg, while improving local dynamic stability compared to a standard phase-variable controller. The proposed controller can significantly improve the performance of lower-limb prostheses in dynamic and compliant environments frequently encountered in daily activities, resulting in improved quality of life for people with lower-limb amputation.
|
| |
| 14:42-14:48, Paper MoBT2.8 | Add to My Program |
| A Unified Controller for Natural Ambulation on Stairs and Level Ground with a Powered Robotic Knee Prosthesis |
|
| Cowan, Marissa | University of Utah |
| Creveling, Suzi | University of Utah |
| Sullivan, Liam | University of Utah |
| Gabert, Lukas | University of Utah |
| Lenzi, Tommaso | University of Utah |
Keywords: Prosthetics and Exoskeletons, Rehabilitation Robotics, Wearable Robotics
Abstract: Powered lower-limb prostheses have the potential to improve amputee mobility by closely imitating the biomechanical function of the missing biological leg. To accomplish this goal, powered prostheses need controllers that can seamlessly adapt to the ambulation activity intended by the user. Most powered prosthesis control architectures address this issue by switching between specific controllers for each activity. This approach requires online classification of the intended ambulation activity. Unfortunately, any misclassification can cause the prosthesis to perform a different movement than the user expects, increasing the likelihood of falls and injuries. Therefore, classification approaches require near-perfect accuracy to be used safely in real life. In this paper, we propose a unified controller for powered knee prostheses which allows for walking, stair ascent, and stair descent without the need for explicit activity classification. Experiments with one individual with an above-knee amputation show that the proposed controller enables seamless transitions between activities. Moreover, transition between activities is possible while leading with either the sound-side or the prosthesis. A controller with these characteristics has the potential to improve amputee mobility.
|
| |
| 14:48-14:54, Paper MoBT2.9 | Add to My Program |
| Volitional EMG Control Enables Stair Climbing with a Robotic Powered Knee Prosthesis |
|
| Creveling, Suzi | University of Utah |
| Cowan, Marissa | University of Utah |
| Sullivan, Liam | University of Utah |
| Gabert, Lukas | University of Utah |
| Lenzi, Tommaso | University of Utah |
Keywords: Prosthetics and Exoskeletons, Wearable Robotics, Cyborgs
Abstract: Existing controllers for robotic powered prostheses regulate the prosthesis speed, timing, and energy generation using predefined position or torque trajectories. This approach enables climbing stairs step-over-step. However, it does not provide amputees with direct volitional control of the robotic prosthesis, a functionality necessary to restore full mobility to the user. Here we show that proportional electromyographic (EMG) control of the prosthesis knee torque enables volitional control of a powered knee prosthesis during stair climbing. The proposed EMG controller continuously regulates knee torque based on activation of the residual hamstrings, measured using a single EMG electrode located within the socket. The EMG signal is mapped to a desired knee flexion/extension torque based on the prosthesis knee position, the residual limb position, and the interaction with the ground. As a result, the proposed EMG controller enabled an above-knee amputee to climb stairs at different speeds, while carrying additional loads, and even backwards. By enabling direct, volitional control of powered robotic knee prostheses, the proposed EMG controller has the potential to improve amputee mobility in the real world.
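Proportional EMG control of the kind described above typically conditions a rectified, low-pass-filtered EMG envelope into a torque command. The following is a simplified sketch under assumed parameters (the gain, deadband, and filter constant are illustrative; the paper's controller additionally conditions the mapping on knee position, residual limb position, and ground contact):

```python
def emg_envelope(raw, alpha=0.05):
    """Rectify and low-pass filter a raw EMG sequence into an envelope
    (single-pole IIR; alpha sets the smoothing time constant)."""
    env, out = 0.0, []
    for sample in raw:
        env += alpha * (abs(sample) - env)
        out.append(env)
    return out

def knee_torque(envelope, gain=40.0, deadband=0.05):
    """Map a normalized EMG envelope (0..1) to a knee torque command in Nm;
    a small deadband suppresses resting muscle tone."""
    return [gain * max(0.0, e - deadband) for e in envelope]

# Full sustained activation approaches gain * (1 - deadband) Nm.
torques = knee_torque(emg_envelope([1.0] * 200))
```

The proportional structure is what gives the user continuous volitional authority: torque tracks muscle effort directly rather than replaying a predefined trajectory.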
|
| |
| 14:54-15:00, Paper MoBT2.10 | Add to My Program |
| Development and Online Validation of an Intrinsic Fault Detector for a Powered Robotic Knee Prosthesis |
|
| Naseri, Amirreza | North Carolina State University |
| Liu, Ming | North Carolina State University |
| Lee, I-Chieh | UNC/NCSU Joint Department of Biomedical Engineering |
| Huang, He (Helen) | North Carolina State University and University of North Carolina |
Keywords: Prosthetics and Exoskeletons, Safety in HRI, Physical Human-Robot Interaction
Abstract: Robotic prosthetic legs have the potential to significantly improve the quality of life of lower limb amputees by enabling locomotion in various environments and task conditions. However, these devices lack the capability to recover from internal intrinsic control faults, which can lead to harmful consequences affecting the user's gait performance and eroding trust in these robotic devices. Therefore, a reliable fault detection system is necessary to detect intrinsic faults in a timely manner and provide a compensatory response to mitigate their effects. This paper focuses on designing an active fault detector for a robotic knee prosthesis and demonstrates its effectiveness in real time. The developed system utilizes a Gaussian Process model to estimate knee angular velocity, which is sensitive to intrinsic faults, and relies on the difference between the estimated velocity and the actual measurement to detect internal control faults. In an offline analysis, the developed detector demonstrated a higher detection rate, lower false alarm ratio, and faster detection time compared with two previously reported approaches. An online demonstration was also conducted with a unilateral amputee participant and showed performance similar to that of the offline analysis. We expect that this detector can be integrated into a fault tolerance strategy to enhance the reliability and safety of robotic prosthetic legs, enabling users to perform their everyday tasks with greater confidence.
|
| |
| 15:00-15:06, Paper MoBT2.11 | Add to My Program |
| A Feasibility Study of Piecewise Phase Variable Based on Variable Toe-Off for the Powered Prosthesis Control: A Case Study |
|
| Hong, Woolim | North Carolina State University |
| Anil Kumar, Namita | Johnson and Johnson |
| Patrick, Shawanee | Texas A&M |
| Moon, Sunwoong | Gwangju Institute of Science and Technology |
| Hur, Pilwon | Gwangju Institute of Science and Technology |
Keywords: Prosthetics and Exoskeletons, Wearable Robotics, Rehabilitation Robotics
Abstract: To achieve stable walking and provide proper assistance, it is crucial to have synchronized control of the prosthesis, treating the user and the prosthesis as a coupled system. Additionally, speed adaptability is important for controlling the prosthesis at different walking speeds. One approach to achieving this is by using a phase variable to estimate the user's gait phase and control the prosthesis in synchrony with the user. However, the current phase variable (i.e., PV) cannot reflect variable toe-off timing at different speeds, although individuals have different toe-off timings per walking speed. To address this issue, we propose a piecewise phase variable (i.e., PW-PV) that can be adjusted for different toe-off timings while estimating the user's gait phase at various walking speeds. As a case study, we conducted a treadmill walking experiment with two participants (i.e., one healthy and one amputee) using a custom-built powered prosthesis. We collected and analyzed joint kinematics, kinetics, and ground reaction force data to validate the feasibility of the PW-PV. The use of the PW-PV resulted in both participants experiencing faster load transfer and a more natural rollover while walking. This allowed healthy and amputee participants to walk with longer push-off durations of 10.6% and 15.2%, respectively, and greater ankle push-off work of 7.3% and 16.9%. Furthermore, with the PW-PV, the amputee participant demonstrated higher vertical ground reaction forces of 5.4% and 4.7% on her prosthesis-side leg during load acceptance and push-off periods, potentially suggesting increased confidence in using the prosthesis. We anticipate that by using the proposed phase variable, we will be able to provide more appropriate and timely assistance to individuals at variable walking speeds.
|
| |
| 15:06-15:12, Paper MoBT2.12 | Add to My Program |
| A Wearable Force-Sensitive and Body-Aware Exoprosthesis for a Transhumeral Prosthesis Socket (I) |
|
| Toedtheide, Alexander | Technical University of Munich, Chair of Robotics and Systems In |
| Pozo Fortunić, Edmundo | Technical University of Munich |
| Kuehn, Johannes | Technical University of Munich |
| Jensen, Elisabeth Rose | Technical University of Munich |
| Haddadin, Sami | Technical University of Munich |
Keywords: Prosthetics and Exoskeletons, Wearable Robots, Mechanism Design, Haptics and Haptic Interfaces
Abstract: Upper limb prostheses are commonly mounted to the human residual limb by a passive socket. By this design, the sensitive residual limb is exposed to reaction wrenches, which can be a source of medical complications. In this work, we introduce an active force-sensitive robotic socket which carries the prosthesis, offloads the residual limb and allows guidance via small interaction forces at the same time. We investigate the feasibility of this concept by a force-sensitive and wearable shoulder exoskeleton, called an exoprosthesis when combined with a prosthesis. We provide a first mechatronics prototype, two floating base controllers and an analysis of the loads acting on the user. Simulations and experiments confirmed the concept and revealed that the wrench at the residual limb can be compensated for the static case and by 50% for the investigated motions. Human-in-the-loop tests were successfully performed by three able-bodied users, showing a real-world use case in a complex grasping situation. Overall, we believe that a force-sensitive robotic socket has the potential to advance prosthetics to a new level as it provides an intuitive and seamless user control interface.
|
| |
| MoBT3 Regular session, 140C |
Add to My Program |
| Collision Avoidance II |
|
| |
| Chair: Song, Kai-Tai | National Yang Ming Chiao Tung University |
| Co-Chair: Firoozi, Roya | Stanford University |
| |
| 14:00-14:06, Paper MoBT3.1 | Add to My Program |
| AdaptiveON: Adaptive Outdoor Local Navigation Method for Stable and Reliable Actions |
|
| Liang, Jing | University of Maryland |
| Kulathun Mudiyanselage, Kasun Weerakoon | University of Maryland, College Park |
| Guan, Tianrui | University of Maryland |
| Karapetyan, Nare | University of Maryland |
| Manocha, Dinesh | University of Maryland |
Keywords: Motion and Path Planning, Planning under Uncertainty, Collision Avoidance
Abstract: We present a novel outdoor navigation algorithm to generate stable and efficient actions to navigate a robot to reach a goal. We use a multi-stage training pipeline and show that our approach produces policies that result in stable and reliable robot navigation on complex terrains. Based on the Proximal Policy Optimization (PPO) algorithm, we developed a novel method to achieve multiple capabilities for outdoor local navigation tasks, namely alleviating the robot's drifting, keeping the robot stable on bumpy terrains, avoiding climbing on hills with steep elevation changes, and avoiding collisions. Our training process mitigates the reality (sim-to-real) gap by introducing generalized environmental and robotic parameters and training with rich features captured from light detection and ranging (Lidar) sensor in a high-fidelity Unity simulator. We evaluate our method in both simulation and real-world environments using Clearpath Husky and Jackal robots. Further, we compare our method against the state-of-the-art approaches and observe that, in the real world, our method improves stability by at least 30.7% on uneven terrains, reduces drifting by 8.08%, and decreases the elevation changes by 14.75%.
|
| |
| 14:06-14:12, Paper MoBT3.2 | Add to My Program |
| Intention Communication and Hypothesis Likelihood in Game-Theoretic Motion Planning |
|
| Chahine, Makram | Massachusetts Institute of Technology |
| Firoozi, Roya | Stanford University |
| Xiao, Wei | MIT |
| Schwager, Mac | Stanford University |
| Rus, Daniela | MIT |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Planning under Uncertainty, Robot Safety
Abstract: Game-theoretic motion planners are a potent solution for controlling systems of multiple highly interactive robots. Most existing game-theoretic planners unrealistically assume a priori objective function knowledge is available to all agents. To address this, we propose a fault-tolerant receding horizon game-theoretic motion planner that leverages inter-agent communication with intention hypothesis likelihood. Specifically, robots communicate their objective function incorporating their intentions. A discrete Bayesian filter is designed to infer the objectives in real-time based on the discrepancy between observed trajectories and the ones from communicated intentions. In simulation, we consider three safety-critical autonomous driving scenarios of overtaking, lane-merging and intersection crossing, to demonstrate our planner's ability to capitalize on alternative intention hypotheses to generate safe trajectories in the presence of faulty transmissions in the communication network.
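The discrete Bayesian filter over intention hypotheses can be sketched as follows. The Gaussian likelihood on the trajectory discrepancy and the parameter names are illustrative assumptions, not the paper's exact model.

```python
import math

def gaussian_likelihood(residual, sigma=1.0):
    """Likelihood of the discrepancy between the observed trajectory and
    the one predicted under an intention hypothesis (illustrative model)."""
    return math.exp(-0.5 * (residual / sigma) ** 2)

def bayes_update(prior, residuals, sigma=1.0):
    """One discrete Bayes step over intention hypotheses:
    posterior ∝ prior × likelihood, then normalize."""
    post = [p * gaussian_likelihood(r, sigma) for p, r in zip(prior, residuals)]
    z = sum(post)
    return [p / z for p in post]
```

A hypothesis whose predicted trajectory closely matches the observation (small residual) accumulates belief over successive updates, while a faulty or stale communicated intention is discounted.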
|
| |
| 14:12-14:18, Paper MoBT3.3 | Add to My Program |
| Collision-Free Reconfiguration Planning for Variable Topology Trusses Using a Linking Invariant |
|
| Spinos, Alexander | University of Pennsylvania |
| Yim, Mark | University of Pennsylvania |
Keywords: Cellular and Modular Robots, Motion and Path Planning, Computational Geometry
Abstract: We introduce a multi-modal reconfiguration planner for the Variable Topology Truss (VTT) modular robot system. The VTT system is a truss-architecture modular self-reconfigurable robot. When a VTT is restricted to a single topology, the collision constraints between the truss members divide the configuration space into many connected components, which makes collision-free planning difficult. This new planner leverages a mathematical invariant based on link theory to find topological reconfiguration actions that can connect these different regions and make progress towards a goal. We show that this planner is effective at finding paths between configurations with different truss topologies.
|
| |
| 14:18-14:24, Paper MoBT3.4 | Add to My Program |
| Hybrid Map-Based Path Planning for Robot Navigation in Unstructured Environments |
|
| Liu, Jiayang | National University of Defense Technology |
| Chen, Xieyuanli | National University of Defense Technology |
| Xiao, Junhao | National University of Defense Technology |
| Sichao, Lin | National University of Defense Technology |
| Zheng, Zhiqiang | National University of Defense Technology |
| Lu, Huimin | National University of Defense Technology |
Keywords: Motion and Path Planning, Collision Avoidance, Autonomous Vehicle Navigation
Abstract: Fast and accurate path planning is important for ground robots to achieve safe and efficient autonomous navigation in unstructured outdoor environments. However, most existing methods exploiting either 2D or 2.5D maps struggle to balance the efficiency and safety for ground robots navigating in such challenging scenarios. In this paper, we propose a novel hybrid map representation by fusing a 2D grid and a 2.5D digital elevation map. Based on it, a novel path planning method is proposed, which considers the robot poses during traversability estimation. By doing so, our method explicitly takes safety as a planning constraint enabling robots to navigate unstructured environments smoothly. The proposed approach has been evaluated on both simulated datasets and a real robot platform. The experimental results demonstrate the efficiency and effectiveness of the proposed method. Compared to state-of-the-art baseline methods, the proposed approach consistently generates safer and easier paths for the robot in different unstructured outdoor environments. The implementation of our method is publicly available at https://github.com/nubot-nudt/T-Hybrid-planner.
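One way to picture the fused 2D/2.5D representation: a cell is traversable only if it is free in the 2D grid and the local elevation gradient stays below a limit. The function, thresholds, and 4-neighbour slope test below are illustrative assumptions, not the paper's actual traversability estimator (which also accounts for robot pose).

```python
def traversable(occ, elev, i, j, max_slope=0.3, cell=0.1):
    """Fuse a 2D occupancy grid with a 2.5D elevation map: cell (i, j) is
    traversable only if it is free in 2D and the slope to each 4-neighbour
    (elevation difference over cell size) is below max_slope."""
    if occ[i][j]:
        return False
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < len(elev) and 0 <= nj < len(elev[0]):
            if abs(elev[ni][nj] - elev[i][j]) / cell > max_slope:
                return False
    return True
```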
|
| |
| 14:24-14:30, Paper MoBT3.5 | Add to My Program |
| CDT-Dijkstra: Fast Planning of Globally Optimal Paths for All Points in 2D Continuous Space |
|
| Liu, Jinyuan | Zhejiang University of Technology |
| Fu, Minglei | Zhejiang University of Technology |
| Zhang, Wen-An | Zhejiang University of Technology, China |
| Chen, Bo | Zhejiang University of Technology |
| Prakapovich, Ryhor | United Institute of Informatics Problems of the National Academy O |
| Sychou, Uladzislau | United Institute of Informatics Problems of the National Academy |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation, Collision Avoidance
Abstract: The Dijkstra algorithm is a classic path planning method, which in a discrete graph space, can start from a specified source node and find the shortest path between the source node and all other nodes in the graph. However, to the best of our knowledge, there is no effective method that achieves a function similar to that of Dijkstra's algorithm in a continuous space. In this study, an optimal path planning algorithm called convex dissection topology (CDT)-Dijkstra is developed, which can quickly compute the global optimal path from one point to all other points in a 2D continuous space. CDT-Dijkstra is mainly divided into two stages: SetInit and GetGoal. In SetInit, the algorithm can quickly obtain the optimal CDT encoding set of all the cut lines based on the initial point. In GetGoal, the algorithm can return the global optimal path of any goal point at an extremely high speed. In this study, we propose and prove the planning principle of considering only the points on the cut lines, thus reducing the state space of the distance optimal path planning task from 2D to 1D. In addition, we propose a fast method to find the optimal path in a homogeneous class and theoretically prove the correctness of the method. Finally, by testing in a series of environments, the experimental results demonstrate that CDT-Dijkstra not only plans the optimal path from all points at once, but also has a significant advantage over advanced algorithms considering certain complex tasks.
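For reference, the discrete one-to-all behavior that CDT-Dijkstra extends to continuous space is the classic Dijkstra algorithm; a minimal sketch (the CDT machinery itself is not reproduced here):

```python
import heapq

def dijkstra_all(graph, source):
    """One-to-all Dijkstra on a weighted graph given as
    {node: [(neighbor, weight), ...]}; returns shortest distances."""
    dist = {source: 0.0}
    pq = [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue  # stale queue entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist
```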
|
| |
| 14:30-14:36, Paper MoBT3.6 | Add to My Program |
| Large Scale Pursuit-Evasion under Collision Avoidance Using Deep Reinforcement Learning |
|
| Yang, Helei | Zhejiang University |
| Ge, Peng | Zhejiang University |
| Cao, Junjie | Institute of Cyber Systems and Control, Zhejiang University |
| Yang, Yifan | Zhejiang University |
| Liu, Yong | Zhejiang University |
Keywords: Multi-Robot Systems, Collision Avoidance, Autonomous Agents
Abstract: This paper examines a pursuit-evasion game (PEG) involving multiple pursuers and evaders. The decentralized pursuers aim to collaborate to capture the faster evaders while avoiding collisions. The policies of all agents are learning-based and are subjected to kinematic constraints that are specific to unicycles. To address the challenge of high dimensionality encountered in large-scale scenarios, we propose a state processing method named Mix-Attention, which is based on Self-Attention. This method effectively mitigates the curse of dimensionality. The simulation results provided in this study demonstrate that the combination of Mix-Attention and Independent Proximal Policy Optimization (IPPO) surpasses alternative approaches when solving the multi-pursuer multi-evader PEG, particularly as the number of entities increases. Moreover, the trained policies adapt to scenarios involving varying numbers of agents and obstacles without requiring retraining, demonstrating their transferability and robustness. Finally, our proposed approach has been validated through physical experiments conducted with six robots.
|
| |
| 14:36-14:42, Paper MoBT3.7 | Add to My Program |
| A Gaussian Variational Inference Approach to Motion Planning |
|
| Yu, Hongzhe | Georgia Institute of Technology |
| Chen, Yongxin | Georgia Institute of Technology |
Keywords: Motion and Path Planning, Planning under Uncertainty, Optimization and Optimal Control
Abstract: We propose a Gaussian variational inference framework for the motion planning problem. In this framework, motion planning is formulated as an optimization over the distribution of the trajectories to approximate the desired trajectory distribution by a tractable Gaussian distribution. Equivalently, the proposed framework can be viewed as a standard motion planning with an entropy regularization. Thus, the solution obtained is a transition from an optimal deterministic solution to a stochastic one, and the proposed framework can recover the deterministic solution by controlling the level of stochasticity. To solve this optimization, we adopt the natural gradient descent scheme. The sparsity structure of the proposed formulation induced by factorized objective functions is further leveraged to improve the scalability of the algorithm. We evaluate our method on several robot systems in simulated environments, and show that it achieves collision avoidance with smooth trajectories while improving robustness over the deterministic baseline, especially in challenging environments and tasks.
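The entropy-regularized Gaussian formulation can be illustrated on a one-dimensional toy problem. The quadratic cost, plain gradient descent (rather than natural gradient), and all parameter values below are simplifying assumptions for illustration only.

```python
def gvi_quadratic(a=2.0, lam=0.25, lr=0.1, iters=500):
    """Toy Gaussian variational inference: minimize
    E_q[0.5*(x - a)^2] - lam * H(q) over q = N(mu, sigma^2).
    The closed-form optimum is mu = a, sigma = sqrt(lam); as lam -> 0
    the solution collapses to the deterministic minimizer x = a."""
    mu, sigma = 0.0, 1.0
    for _ in range(iters):
        grad_mu = mu - a                  # d/dmu of E_q[0.5*(x-a)^2]
        grad_sigma = sigma - lam / sigma  # d/dsigma of (0.5*sigma^2 - lam*log(sigma))
        mu -= lr * grad_mu
        sigma -= lr * grad_sigma
    return mu, sigma
```

The regularization weight `lam` plays the role of the "level of stochasticity" mentioned in the abstract: shrinking it recovers the deterministic solution.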
|
| |
| 14:42-14:48, Paper MoBT3.8 | Add to My Program |
| Exploring Social Motion Latent Space and Human Awareness for Effective Robot Navigation in Crowded Environments |
|
| Ansari, Junaid Ahmed | Tata Consultancy Services |
| Tourani, Satyajit | TCS |
| Kumar, Gourav | Tata Consultancy Services, Kolkata , India |
| Bhowmick, Brojeshwar | Tata Consultancy Services |
Keywords: Collision Avoidance, Social HRI, Machine Learning for Robot Control
Abstract: This work proposes a novel approach to social robot navigation by learning to generate robot controls from a social motion latent space. By leveraging this social motion latent space, the proposed method achieves significant improvements in social navigation metrics such as success rate, navigation time, and trajectory length while producing smoother (less jerk and angular deviations) and more anticipatory trajectories. The superiority of the proposed method is demonstrated through comparison with baseline models in various scenarios. Additionally, the concept of humans' awareness towards the robot is introduced into the social robot navigation framework showing that incorporating human awareness leads to shorter and smoother trajectories owing to humans' ability to positively interact with the robot.
|
| |
| 14:48-14:54, Paper MoBT3.9 | Add to My Program |
| DS-MPEPC: Safe and Deadlock-Avoiding Robot Navigation in Cluttered Dynamic Scenes |
|
| Arul, Senthil Hariharan | University of Maryland, College Park |
| Park, Jong Jin | Amazon Lab126 |
| Manocha, Dinesh | University of Maryland |
Keywords: Motion and Path Planning, Collision Avoidance
Abstract: We present an algorithm for safe robot navigation in complex dynamic environments using a variant of model predictive equilibrium point control. We use an optimization formulation to navigate robots gracefully in dynamic environments by optimizing over a trajectory cost function at each timestep. We present a novel trajectory cost formulation that significantly reduces conservative and deadlocking behaviors and generates smooth trajectories. In particular, we propose a new collision probability function that effectively captures the risk associated with a given configuration and the time to avoid collisions based on the velocity direction. Moreover, we propose a terminal state cost based on the expected time-to-goal and time-to-collision values that helps in avoiding trajectories that could result in deadlock. We evaluate our cost formulation in multiple simulated scenarios, including narrow corridors with dynamic obstacles, and observe significantly improved navigation behavior and reduced deadlocks as compared to prior methods.
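A time-to-collision quantity like the one entering the terminal cost can be computed in closed form for a constant relative velocity. This is a generic sketch of the standard geometry, not the paper's specific cost formulation.

```python
import math

def time_to_collision(p, v, r):
    """Smallest t >= 0 with |p + v*t| = r for relative position p,
    relative velocity v, and combined radius r; inf if no collision.
    Solves the quadratic |v|^2 t^2 + 2 p.v t + (|p|^2 - r^2) = 0."""
    a = v[0] ** 2 + v[1] ** 2
    b = 2.0 * (p[0] * v[0] + p[1] * v[1])
    c = p[0] ** 2 + p[1] ** 2 - r * r
    if c <= 0:
        return 0.0          # already overlapping
    if a == 0:
        return math.inf     # no relative motion
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return math.inf     # paths never intersect
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t >= 0 else math.inf
```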
|
| |
| 14:54-15:00, Paper MoBT3.10 | Add to My Program |
| 3D-Online Generalized Sensed Shape Expansion: A Probabilistically Complete Motion Planner in Obstacle-Cluttered Unknown Environments |
|
| Zinage, Vrushabh | University of Texas at Austin |
| Arul, Senthil Hariharan | University of Maryland, College Park |
| Manocha, Dinesh | University of Maryland |
| Ghosh, Satadal | Indian Institute of Technology Madras |
Keywords: Collision Avoidance, Motion and Path Planning, Simulation and Animation
Abstract: We present an online motion planning algorithm (3D-OGSSE) for generating smooth, collision-free trajectories over multiple planning iterations for a 3-D agent operating in an unknown, obstacle-cluttered, 3-D environment. In each planning iteration, 3D-OGSSE constructs an obstacle-free region termed 'generalized sensed shape' based on the locally-sensed environment information and the notion of generalized shape. A collision-free path is computed by sampling points in the generalized sensed shape and is used to generate a smooth, time-parametrized trajectory by minimizing snap. The generated trajectory at every planning iteration is constrained to lie within the generalized sensed shape, which ensures the agent maneuvers in locally obstacle-free space. As the agent reaches the boundary of the generalized sensed shape in a planning iteration, a re-plan is triggered by a receding horizon planning mechanism that also enables the initialization of the next planning iteration. We also present a theoretical guarantee for probabilistic completeness of the developed algorithm over the entire environment and for completely collision-free trajectory generation. We evaluate the proposed method in simulation on complex 3-D environments with varied obstacle densities. Further, we also evaluate in scenarios with sensor noise and constraints on the on-board sensor's field of view (FOV). We observe that each planning iteration computation takes approximately 14 milliseconds on a single thread of an Intel Core i5-8500 3.0 GHz CPU, which is significantly faster than several existing algorithms. In addition, we also observe 3D-OGSSE to be less conservative in complex scenarios such as narrow passages.
|
| |
| 15:00-15:06, Paper MoBT3.11 | Add to My Program |
| Safe and Efficient Dynamic Window Approach for Differential Mobile Robots with Stochastic Dynamics Using Deterministic Sampling |
|
| Yasuda, Shinya | NEC Corporation |
| Kumagai, Taichi | NEC Corporation |
| Yoshida, Hiroshi | NEC Corporation |
Keywords: Collision Avoidance, Planning under Uncertainty, Motion and Path Planning
Abstract: We propose an efficient and safe dynamic window approach (DWA) by using deterministic sampling. When the system dynamics have uncertainty, the control input includes errors, so that the DWA objective function becomes a random variable. When a random-choice algorithm with a finite number of samples is used to estimate the objective function, it may miss collisions during prediction. In this work, we approximate the end-state distribution as a one-dimensional distribution for each input candidate in advance and generate sample paths deterministically to eliminate such misses and achieve safe control. Numerical experiments have shown that this method is approximately three times as efficient as the Monte Carlo method in most indoor environments.
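The contrast with Monte Carlo sampling can be sketched as follows: instead of random draws of the uncertain input, a small fixed grid of perturbations (sigma-point style) is evaluated per velocity candidate. The unicycle rollout, the 3×3 perturbation grid, and all parameters are illustrative assumptions, not the paper's actual one-dimensional end-state approximation.

```python
import math

def end_pose(x, y, th, v, w, dt=0.1, steps=10):
    """Forward-simulate unicycle dynamics for a (v, w) candidate."""
    for _ in range(steps):
        x += v * math.cos(th) * dt
        y += v * math.sin(th) * dt
        th += w * dt
    return x, y, th

def worst_case_clearance(v, w, obstacles, v_sigma=0.05, w_sigma=0.05):
    """Deterministic sampling: evaluate a fixed, small set of input
    perturbations instead of random Monte Carlo draws, and score the
    candidate by its worst-case obstacle clearance."""
    samples = [(v + i * v_sigma, w + j * w_sigma)
               for i in (-1, 0, 1) for j in (-1, 0, 1)]
    clear = float('inf')
    for vs, ws in samples:
        x, y, _ = end_pose(0.0, 0.0, 0.0, vs, ws)
        for ox, oy in obstacles:
            clear = min(clear, math.hypot(x - ox, y - oy))
    return clear
```

Because the sample set is fixed, the same worst-case perturbations are checked for every candidate, so a collision that random sampling might miss is always evaluated.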
|
| |
| 15:06-15:12, Paper MoBT3.12 | Add to My Program |
| Path Re-Planning Design of a Cobot in a Dynamic Environment Based on Current Obstacle Configuration |
|
| Lee, Chuan-Che | National Yang Ming Chiao Tung University |
| Song, Kai-Tai | National Yang Ming Chiao Tung University |
Keywords: Collision Avoidance, Motion and Path Planning, Human-Robot Collaboration
Abstract: This study proposes a path planning algorithm to generate a collision-free path that avoids static and dynamic obstacles in real time. An efficient path re-planning method is presented for obstacle avoidance for a cobot in an environment that is shared by humans and robots. Static and dynamic obstacles are tracked when the manipulator executes a trajectory along a planned initial static path. When a dynamic obstacle enters the robot's workspace, the proposed method re-plans a collision-free local path to avoid static and dynamic obstacles. To allow fast local re-planning, a hybrid method that combines the advantages of the APF and RRT path planning algorithms is proposed. The weight factors for the hybrid method are determined according to the current configuration of obstacles. The experimental results for a TM5-700 manipulator show that the proposed method decreases re-planning time and path length in an environment with static and dynamic obstacles. The path re-planning time is at least 55% less than that of two existing path planning optimization methods, D-RRT and VF-RRT.
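The APF half of such a hybrid can be sketched with the classic attractive/repulsive potential gradients (shown here in 2D for brevity; the gains and influence distance are illustrative, not the paper's tuning).

```python
import math

def apf_step(q, goal, obstacles, k_att=1.0, k_rep=0.5, d0=1.0):
    """One artificial-potential-field step: attractive pull toward the
    goal plus repulsive push from obstacles within influence distance d0
    (classic Khatib-style gradients)."""
    fx = k_att * (goal[0] - q[0])
    fy = k_att * (goal[1] - q[1])
    for ox, oy in obstacles:
        dx, dy = q[0] - ox, q[1] - oy
        d = math.hypot(dx, dy)
        if 0 < d < d0:
            mag = k_rep * (1.0 / d - 1.0 / d0) / d ** 2
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy
```

In a hybrid scheme, this gradient can bias RRT extension toward free space, while the tree search supplies the global connectivity that pure APF lacks near local minima.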
|
| |
| 15:12-15:18, Paper MoBT3.13 | Add to My Program |
| DRL-VO: Learning to Navigate through Crowded Dynamic Scenes Using Velocity Obstacles (I) |
|
| Xie, Zhanteng | Temple University |
| Dames, Philip | Temple University |
Keywords: Collision Avoidance, Deep Learning in Robotics and Automation, Field Robots, Reactive and Sensor-Based Planning
Abstract: This paper proposes a novel learning-based control policy with strong generalizability to new environments that enables a mobile robot to navigate autonomously through spaces filled with both static obstacles and dense crowds of pedestrians. The policy uses a unique combination of input data to generate the desired steering angle and forward velocity: a short history of lidar data, kinematic data about nearby pedestrians, and a sub-goal point. The policy is trained in a reinforcement learning setting using a reward function that contains a novel term based on velocity obstacles to guide the robot to actively avoid pedestrians and move towards the goal. Through a series of 3D simulated experiments with up to 55 pedestrians, this control policy is able to achieve a better balance between collision avoidance and speed (i.e., higher success rate and faster average speed) than state-of-the-art model-based and learning-based policies, and it also generalizes better to different crowd sizes and unseen environments. An extensive series of hardware experiments demonstrate the ability of this policy to directly work in different real-world environments with different crowd sizes with zero retraining.
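The velocity-obstacle test underlying such a reward term checks whether the relative velocity lies inside the collision cone toward a pedestrian. This is a generic sketch of the standard geometric test, not the paper's reward function.

```python
import math

def in_velocity_obstacle(p_rel, v_rel, r_sum):
    """True if relative velocity v_rel lies inside the collision cone
    defined by relative position p_rel (robot to obstacle) and the
    combined radius r_sum, i.e. the pair is on a collision course."""
    dist = math.hypot(*p_rel)
    if dist <= r_sum:
        return True  # already overlapping
    half = math.asin(r_sum / dist)           # cone half-angle
    ang_to_obs = math.atan2(p_rel[1], p_rel[0])
    ang_v = math.atan2(v_rel[1], v_rel[0])
    diff = abs((ang_v - ang_to_obs + math.pi) % (2 * math.pi) - math.pi)
    closing = (p_rel[0] * v_rel[0] + p_rel[1] * v_rel[1]) > 0
    return closing and diff < half
```

A reward can then penalize actions whose resulting velocity falls inside any pedestrian's cone, steering the policy toward velocities outside all velocity obstacles.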
|
| |
| MoBT4 Regular session, 140D |
Add to My Program |
| Motion Control |
|
| |
| Chair: Kim, Jeeseop | Caltech |
| Co-Chair: Johnson, Aaron M. | Carnegie Mellon University |
| |
| 14:00-14:06, Paper MoBT4.1 | Add to My Program |
| Model Predictive Control of Autonomous Vehicles with Integrated Barriers Using Occupancy Grid Maps |
|
| Cho, Minsu | Korea Advanced Institute of Science and Techonology |
| Lee, Yeongseok | Korea Advanced Institute of Science and Technology |
| Kim, Kyung-Soo | KAIST(Korea Advanced Institute of Science and Technology) |
Keywords: Integrated Planning and Control, Motion and Path Planning, Autonomous Vehicle Navigation
Abstract: Nonlinear model predictive control (NMPC) is an efficient and proven method for optimization-based autonomous vehicle motion planning. Among the various approaches, the iterative linear quadratic regulator, a differential dynamic programming variant, is a well-known efficient nonlinear optimization method. In safety-critical control systems, controllers should address inequality-constrained optimization problems. In this work, we design a single unified constraint using an occupancy grid map. We convert this inequality-constrained optimization problem into an unconstrained optimization problem by appending a single integrated discrete barrier state to the system model. This approach simplifies complex motion planning problems and reduces computational costs. In the proposed method, we first discretize the surrounding environment with an occupancy grid map and design a single constraint that ensures that only cells with values less than a predefined threshold can be traversed by the ego vehicle. Then, we define a single integrated discrete barrier state to introduce this constraint into the motion planning algorithm. The proposed method, a penalty method, and the augmented Lagrangian method are tested on a real-time software-in-the-loop simulation using CarMaker and ROS. The simulation results of pop-up obstacle avoidance scenarios show the benefits of the proposed method, such as reduced time costs and increased robustness.
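The barrier idea on an occupancy value can be sketched with a log barrier: the cost stays finite while a cell's occupancy is below the threshold and diverges as it approaches it. This illustrates only the barrier function itself, not the discrete barrier *state* appended to the system dynamics in the paper.

```python
import math

def barrier_cost(occupancy, threshold=0.5, eps=1e-6):
    """Log barrier on an occupancy-grid value: finite while
    occupancy < threshold, unbounded at and beyond the threshold,
    so the optimizer is pushed away from occupied cells."""
    margin = threshold - occupancy
    if margin <= 0:
        return float('inf')
    return -math.log(margin + eps)
```

Adding such a term to the running cost converts the hard "traverse only cells below the threshold" constraint into an unconstrained objective, which is the general mechanism the abstract describes.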
|
| |
| 14:06-14:12, Paper MoBT4.2 | Add to My Program |
| Path-Following Control with Path and Orientation Snap-In |
|
| Hartl-Nesic, Christian | TU Wien |
| Pritzi, Elias | TU Wien |
| Kugi, Andreas | TU Wien |
Keywords: Human-Robot Collaboration, Compliance and Impedance Control, Industrial Robots
Abstract: Robots need to be as simple to use as tools in a workshop and allow non-experts to program, modify and execute tasks. In particular for repetitive tasks in high-mix/low-volume production, robotic support and physical human-robot interaction (pHRI) help to significantly increase productivity. In path-following control (PFC), the geometric description of the path is decoupled from the time evolution of the robot's end-effector along the path. PFC is inherently suitable for pHRI since path progress can be derived from the interaction with the human. In this work, an extension to multi-path PFC is proposed, which allows smooth transitions between the paths initiated by the human. Additionally, two pHRI modes called path snap-in and orientation snap-in are proposed, which use attractive forces to snap the robot end-effector onto a path or a predefined orientation. Moreover, the stability properties of PFC are inherited and the method is applicable to linear, nonlinear and self-intersecting paths. The proposed pHRI modes are validated on an experimental drilling task for teach-in (using orientation snap-in) and execution (using path snap-in) with the kinematically redundant collaborative robot KUKA LBR iiwa 14 R820.
|
| |
| 14:12-14:18, Paper MoBT4.3 | Add to My Program |
| Design and Control of a Reluctance-Based Micropositioning Stage for Scanning Motion Applications |
|
| Al Saaideh, Mohammad | Memorial University of Newfoundland |
| Alatawneh, Natheer | Cysca Technologies |
| Aljanaideh, Khaled | Jordan University of Science and Technology |
| Al Janaideh, Mohammad | University of Guelph |
Keywords: Motion Control
Abstract: This paper presents a design and characterization of a micropositioning stage driven by a reluctance actuator. The stage is constructed with a C-core reluctance actuator and four compression springs. The design of the stage is presented using a CAD model, followed by the fabrication process of the prototype. The mathematical model is formulated to present the interaction among the stage's electrical, magnetic, and mechanical dynamic behaviour. Next, the force-current and force-gap characteristics are obtained by measuring the force under different applied currents and air gaps. After that, the system is analyzed to determine the maximum applied voltage that stabilizes the system in an open-loop configuration, followed by the time-domain and frequency-domain response. Finally, the feedforward controller is presented to linearize the dynamic behavior of the stage over a specific range of motion. The experimental results under the feedforward controller show a linear characteristic between the desired force and the output displacement.
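The force-current and force-gap characteristics of a C-core reluctance actuator follow the classic quadratic model F = k (i/g)^2, and inverting it yields the kind of linearizing feedforward the abstract mentions. The lumped constant `k` below is an illustrative placeholder, not a measured value from the paper.

```python
def reluctance_force(i, g, k=1e-5):
    """Classic C-core reluctance force model F = k * (i / g)^2, where k
    lumps the permeability, coil turns, and pole-face area (the value
    here is illustrative, not from the paper's characterization)."""
    return k * (i / g) ** 2

def feedforward_current(f_des, g, k=1e-5):
    """Invert the force model so the commanded-force-to-output map is
    linear: i = g * sqrt(F / k)."""
    return g * (f_des / k) ** 0.5
```

Composing the two recovers the desired force exactly under the model, which is the sense in which the feedforward "linearizes" the stage's behavior over its motion range.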
|
| |
| 14:18-14:24, Paper MoBT4.4 | Add to My Program |
| Body Posture Controller for Actively Articulated Tracked Vehicles Moving Over Rough and Unknown Terrains |
|
| Santos Rocha, Filipe Augusto | COPPE / Federal University of Rio De Janeiro (UFRJ) |
| Cid, André | Instituto Tecnologico Vale |
| Delunardo, Mario | Instituto Tecnologico Vale |
| P. Junior, Renato | Instituto Tecnologico Vale |
| Costa Pereira de S. Thiago Neto, Nilton | Universidade Federal De Ouro Preto |
| Barros, Luiz | Instituto Tecnologico Vale |
| D. Domingues, Jaco | Instituto Tecnologico Vale |
| Pessin, Gustavo | Instituto Tecnológico Vale |
| Freitas, Gustavo | Federal University of Minas Gerais |
| Costa, Ramon | Federal University of Rio De Janeiro |
Keywords: Motion Control, Kinematics, Field Robots
Abstract: Terrestrial mobile robots face diverse topographies while in field missions. Rough terrains cause the platform to oscillate, which is undesirable for some tasks. Robotic platforms with active tracked flippers can use such mechanisms to reach and maintain a leveled configuration while halted or moving. Thus, this work presents a posture controller that regulates the robot's orientation and contact plane clearance using flippers while the robot moves over unknown, uneven ground. The method takes as input the flippers' joint position, torque, and the robot chassis orientation, outputting as the command signal the flippers' joint velocities. Based on Stewart platforms, a differential kinematics model relates the desired platform motion to flippers' frame velocities. Later, a flippers-ground interaction model transforms their frames' computed velocities to flippers' joint speed commands. The controller is based on dual-quaternion algebra for generating the error signal. The efficacy of the proposed controller is evaluated experimentally on an industrial robotic platform as it moves along an open field track. The method successfully regulates the robot's posture while navigating over non-modeled rough terrain.
|
| |
| 14:24-14:30, Paper MoBT4.5 | Add to My Program |
| Exploring Learning-Based Control Policy for Fish-Like Robots in Altered Background Flows |
|
| Lin, Xiaozhu | ShanghaiTech University |
| Song, Wenbin | ShanghaiTech University |
| Liu, Xiaopei | ShanghaiTech University |
| He, Xuming | ShanghaiTech University |
| Wang, Yang | ShanghaiTech University |
Keywords: Motion Control, Biologically-Inspired Robots, Reinforcement Learning
Abstract: The study of motion control for the fish-like robots in complex fluid fields is of great importance in improving the performance of underwater vehicles, due to its strong maneuverability, propulsion efficiency, and deceptive visual appearance. In this article, a novel learning-based control framework is first proposed to autonomously explore efficient control policies that are capable of performing motion control tasks in non-quiescent and unknown background flows. First, we utilize a high-fidelity simulation system, named FishGym, to generate various uniform flows. Next, a DRL-based algorithm is incorporated with the FishGym to train the fish-like robot to control its motion to optimally complete a delicately designed task (Approaching Target and Stay) in both quiescent and uniform flow. Then, the obtained control policy together with an online estimator is directly applied to a Path-Following Task. The proposed framework well balances the simulation accuracy and the computational efficiency, which is of crucial importance for effective coupling with the learning algorithm. The simulation results indicate that, via the proposed learning framework, the robot successfully acquired a swimming strategy that can be used to adapt to different background flows and tasks. Furthermore, we also observe some adaptation behavior of the robot, such as rheotaxis, that is similar to the fish in nature, which gains us more insight into the mechanism underlying the adaptation behavior of fish in a complex environment.
|
| |
| 14:30-14:36, Paper MoBT4.6 | Add to My Program |
| On the Design of Region-Avoiding Metrics for Collision-Safe Motion Generation on Riemannian Manifolds |
|
| Klein, Holger | Karlsruhe Institute of Technology |
| Jaquier, Noémie | Karlsruhe Institute of Technology |
| Meixner, Andre | Karlsruhe Institute of Technology (KIT) |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: Motion Control, Human and Humanoid Motion Analysis and Synthesis, Dynamics
Abstract: The generation of energy-efficient and dynamic-aware robot motions that satisfy constraints such as joint limits, self-collisions, and collisions with the environment remains a challenge. In this context, Riemannian geometry offers promising solutions by identifying robot motions with geodesics on the so-called configuration space manifold. While this manifold naturally considers the intrinsic robot dynamics, constraints such as joint limits, self-collisions, and collisions with the environment remain overlooked. In this paper, we propose a modification of the Riemannian metric of the configuration space manifold allowing for the generation of robot motions as geodesics that efficiently avoid given regions. We introduce a class of Riemannian metrics based on barrier functions that guarantee strict region avoidance by systematically generating accelerations away from no-go regions in joint and task space. We evaluate the proposed Riemannian metric to generate energy-efficient, dynamic-aware, and collision-free motions of a humanoid robot as geodesics and sequences thereof.
|
| |
| 14:36-14:42, Paper MoBT4.7 | Add to My Program |
| Towards Connecting Control to Perception: High-Performance Whole-Body Collision Avoidance Using Control-Compatible Obstacles |
|
| Eckhoff, Moritz | Technical University of Munich (TUM) |
| Knobbe, Dennis | Technical University of Munich (TUM) |
| Zwirnmann, Henning | Technical University of Munich |
| Swikir, Abdalla | Technical University of Munich |
| Haddadin, Sami | Technical University of Munich |
Keywords: Whole-Body Motion Planning and Control, Force Control, Multi-Modal Perception for HRI
Abstract: One of the most important aspects of autonomous systems is safety. This includes ensuring safe human-robot and robot-environment interaction when autonomously performing complex tasks or in collaborative scenarios. Although several methods have been introduced to tackle this, most are unsuitable for real-time applications and require carefully hand-crafted obstacle descriptions. In this work, we propose a method combining high-frequency, real-time self- and environment-collision avoidance of a robotic manipulator with low-frequency, multimodal, high-resolution environmental perception accumulated in a digital twin system. Our method is based on geometric primitives, so-called primitive skeletons. These, in turn, are information-compressed and real-time compatible digital representations of the robot's body and environment, automatically generated from ultra-realistic virtual replicas of the real world provided by the digital twin. Our approach is a key enabler for closing the loop between environment perception and robot control by providing the millisecond real-time control stage with a current and accurate world description, empowering it to react to environmental changes. We evaluate our whole-body collision avoidance on a 9-DOF robot system through five experiments, demonstrating the functionality and efficiency of our framework.
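The primitive-skeleton idea rests on cheap distance checks between geometric primitives. As a generic illustration (not the paper's code), the sketch below tests clearance between a capsule-shaped link and a spherical obstacle in constant time — the kind of check that can run at millisecond control rates.

```python
import numpy as np

def point_segment_dist(p, a, b):
    """Closest distance from point p to the segment from a to b."""
    ab = b - a
    t = np.clip((p - a) @ ab / (ab @ ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

def capsule_sphere_safe(cap_a, cap_b, cap_r, sph_c, sph_r, margin=0.0):
    """True if a capsule (segment plus radius cap_r) keeps at least
    `margin` clearance from a sphere of radius sph_r."""
    return point_segment_dist(sph_c, cap_a, cap_b) > cap_r + sph_r + margin

link_a, link_b = np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])
print(capsule_sphere_safe(link_a, link_b, 0.1, np.array([0.5, 0.5, 0.0]), 0.1))   # True
print(capsule_sphere_safe(link_a, link_b, 0.1, np.array([0.5, 0.15, 0.0]), 0.1))  # False
```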
|
| |
| 14:42-14:48, Paper MoBT4.8 | Add to My Program |
| Real-Time Whole-Body Collision Avoidance and Path Following of a Snake Robot through MPC-Based Optimization Strategies |
|
| Wang, Liuyin | University of Shanghai for Science and Technology |
| Wang, Gang | University of Nevada |
| Li, Yuan | University of Shanghai for Science and Technology |
| Li, Peng | Harbin Institute of Technology ShenZhen |
| Ji, Yunfeng | University of Shanghai for Science and Technology |
| Wang, Chaoli | University of Shanghai for Science and Technology |
| Shen, Yantao | University of Nevada, Reno |
Keywords: Redundant Robots, Motion Control, Actuation and Joint Mechanisms
Abstract: The work in this paper delves into the challenge of obstacle avoidance for the whole elongated body during path following for a class of bionic snake robots. Currently, most studies focus solely on preventing the robot's head from colliding with obstacles through designed controllers. However, due to the unique elongated structure and biomimetic locomotion modes of snake robots, it is unavoidable that the rest of the robot's body could still collide with obstacles. To resolve this problem, we propose a novel real-time optimization-based obstacle avoidance strategy for a class of terrestrial snake robots with a multi-link elongated body using model predictive control (MPC). Moreover, by leveraging the elongated body characteristics of the robot, an improved path guidance strategy is also developed. The effectiveness of the proposed strategies is verified and validated through extensive simulations and experiments on a custom-built nine-link elongated snake robot. The results demonstrate that all links of the robot can well avoid obstacles while continuing to track the given path.
|
| |
| 14:48-14:54, Paper MoBT4.9 | Add to My Program |
| Safety-Critical Coordination for Cooperative Legged Locomotion Via Control Barrier Functions |
|
| Kim, Jeeseop | Caltech |
| Lee, Jaemin | California Institute of Technology |
| Ames, Aaron | Caltech |
Keywords: Motion Control, Legged Robots, Robot Safety
Abstract: This paper presents a safety-critical approach to the coordinated control of cooperative robots locomoting in the presence of fixed (holonomic) constraints. To this end, we leverage control barrier functions (CBFs) to ensure the safe cooperation of the robots while maintaining a desired formation and avoiding obstacles. The top-level planner generates a set of feasible trajectories, accounting for both kinematic constraints between the robots and physical constraints of the environment. This planner leverages CBFs to ensure safety-critical coordination control, i.e., guarantee safety of the collaborative robots during locomotion. The middle-level trajectory planner incorporates interconnected single rigid body (SRB) dynamics to generate optimal ground reaction forces (GRFs) to track the safety-ensured trajectories from the top-level planner while addressing the interconnection dynamics between agents. Distributed low-level controllers generate whole-body motion to follow the prescribed optimal GRFs while ensuring the friction cone condition at each end of the stance legs. The effectiveness of the approach is demonstrated through numerical simulations and experimentally on a pair of quadrupedal robots.
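The friction cone condition enforced by the low-level controllers can be written as a one-line check (a standard formulation, not the paper's implementation): the tangential component of the ground reaction force must not exceed the friction coefficient times the normal component.

```python
import numpy as np

def in_friction_cone(f, n, mu):
    """Check the friction cone condition for a ground reaction force f
    with unit contact normal n: the contact must push (f_n >= 0) and the
    tangential part must satisfy ||f_t|| <= mu * f_n (no slip)."""
    fn = float(f @ n)
    ft = f - fn * n
    return fn >= 0.0 and np.linalg.norm(ft) <= mu * fn

n = np.array([0.0, 0.0, 1.0])
print(in_friction_cone(np.array([0.2, 0.0, 1.0]), n, mu=0.5))  # True: 0.2 <= 0.5
print(in_friction_cone(np.array([0.8, 0.0, 1.0]), n, mu=0.5))  # False: would slip
```

In an optimization-based controller this condition typically appears as a (linearized) constraint on the decision variables rather than an after-the-fact check.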
|
| |
| 14:54-15:00, Paper MoBT4.10 | Add to My Program |
| Staged Contact Optimization: Combining Contact-Implicit and Multi-Phase Hybrid Trajectory Optimization |
|
| Turski, Michael R. | Carnegie Mellon University |
| Norby, Joseph | Apptronik |
| Johnson, Aaron M. | Carnegie Mellon University |
Keywords: Multi-Contact Whole-Body Motion Planning and Control, Legged Robots, Optimization and Optimal Control
Abstract: Trajectory optimization problems for legged robots are commonly formulated with fixed contact schedules. These multi-phase Hybrid Trajectory Optimization (HTO) methods result in locally optimal trajectories, but the result depends heavily upon the predefined contact mode sequence. Contact-Implicit Optimization (CIO) offers a potential solution to this issue by allowing the contact mode to be determined throughout the trajectory by the optimization solver. However, CIO suffers from long solve times and convergence issues. This work combines the benefits of these two methods into one algorithm: Staged Contact Optimization (SCO). SCO tightens constraints on contact in stages, eventually fixing them to allow robust and fast convergence to a feasible solution. Results on a planar biped and spatial quadruped demonstrate speed and optimality improvements over CIO and HTO. These properties make SCO well suited for offline trajectory generation or as an effective tool for exploring the dynamic capabilities of a robot.
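The staging idea can be sketched on a toy contact problem (illustrative assumptions, not the paper's optimizer): a gap g and contact force f must eventually satisfy the complementarity g*f = 0, but the relaxed constraint g*f <= eps is tightened in stages, warm-starting each solve from the previous solution. scipy's SLSQP stands in for the trajectory-optimization solver here.

```python
import numpy as np
from scipy.optimize import minimize

def staged_solve(stages=(1.0, 0.1, 0.01, 0.0)):
    """Toy staged tightening: relax the complementarity g*f <= eps,
    then shrink eps to 0, warm-starting each stage from the last solution."""
    x = np.array([0.5, 0.5])            # decision variables [gap g, force f]
    for eps in stages:
        res = minimize(
            lambda x: (x[0] - 0.3) ** 2 + (x[1] - 0.4) ** 2,   # toy objective
            x,
            method="SLSQP",
            bounds=[(0, None), (0, None)],
            constraints=[{"type": "ineq",
                          "fun": lambda x, e=eps: e - x[0] * x[1]}],
        )
        x = res.x                       # warm start for the next stage
    return x

g, f = staged_solve()
print(g * f)   # complementarity holds at the final, fully tightened stage
```

The final stage fixes the constraint exactly (eps = 0), mirroring how SCO eventually fixes the contact schedule for robust convergence.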
|
| |
| 15:00-15:06, Paper MoBT4.11 | Add to My Program |
| Hierarchical Relaxation of Safety-Critical Controllers: Mitigating Contradictory Safety Conditions with Application to Quadruped Robots |
|
| Lee, Jaemin | California Institute of Technology |
| Kim, Jeeseop | Caltech |
| Ames, Aaron | Caltech |
Keywords: Motion Control, Legged Robots, Robot Safety
Abstract: The safety-critical control of robotic systems often must account for multiple, potentially conflicting, safety constraints. This paper proposes novel relaxation techniques to address safety-critical control problems in the presence of conflicting safety conditions. In particular, Control Barrier Functions (CBFs) provide a means to encode safety as constraints in a Quadratic Program (QP), wherein multiple safety conditions yield multiple constraints. However, the QP becomes infeasible when the safety conditions cannot be simultaneously satisfied. To resolve this potential infeasibility, we introduce a hierarchy between the safety conditions and employ an additional variable to relax the less important safety conditions (Relaxed-CBF-QP). We also formulate a cascaded structure to achieve smaller violations of lower-priority safety conditions (Hierarchical-CBF-QP). The proposed approach therefore ensures the existence of at least one solution to the QP with the CBFs while dynamically balancing enforcement of additional safety constraints. Importantly, this paper evaluates the impact of different weighting factors in the Hierarchical-CBF-QP and, due to the sensitivity of the observed behavior to these weightings, proposes a method to determine the weighting factors via a sampling-based technique. The validity of the proposed approach is demonstrated through simulations and experiments on a quadrupedal robot navigating to a goal through regions with different levels of danger.
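A minimal version of the relaxation idea can be sketched as a small program (an assumed form; the paper's QP, weights, and hierarchy are richer): the high-priority CBF constraint is kept hard while the lower-priority one is softened by a penalized slack, so a solution exists even when the two conditions contradict. scipy's SLSQP stands in for a QP solver.

```python
import numpy as np
from scipy.optimize import minimize

def relaxed_cbf_qp(u_des, a1, b1, a2, b2, w=100.0):
    """Toy relaxed CBF-QP: a1@u >= b1 is hard, a2@u >= b2 - delta is
    softened by a penalized slack delta >= 0, so the problem stays
    feasible even when the two safety conditions conflict."""
    n = len(u_des)
    x0 = np.zeros(n + 1)                         # decision vars: [u, delta]
    cost = lambda x: np.sum((x[:n] - u_des) ** 2) + w * x[n] ** 2
    cons = [
        {"type": "ineq", "fun": lambda x: a1 @ x[:n] - b1},          # hard
        {"type": "ineq", "fun": lambda x: a2 @ x[:n] - b2 + x[n]},   # soft
        {"type": "ineq", "fun": lambda x: x[n]},                     # delta >= 0
    ]
    res = minimize(cost, x0, method="SLSQP", constraints=cons)
    return res.x[:n], res.x[n]

# Conflicting 1-D conditions: u >= 1 (hard) and -u >= 0.5 (soft) cannot both hold.
u, delta = relaxed_cbf_qp(np.array([0.0]), np.array([1.0]), 1.0,
                          np.array([-1.0]), 0.5)
print(u, delta)   # the hard constraint is met; the soft one is relaxed via delta
```

Here the optimum is u = 1 with delta = 1.5: the hard condition binds and the slack absorbs exactly the amount by which the soft condition must be violated.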
|
| |
| 15:06-15:12, Paper MoBT4.12 | Add to My Program |
| A Recursive Lie-Group Formulation for the Second-Order Time Derivatives of the Inverse Dynamics of Parallel Kinematic Manipulators |
|
| Mueller, Andreas | Johannes Kepler University |
| Kumar, Shivesh | DFKI GmbH |
| Kordik, Thomas | Johannes Kepler University, Institute of Robotics |
Keywords: Parallel Robots, Compliant Joints and Mechanisms, Motion Control
Abstract: Series elastic actuators (SEA) were introduced for serial robotic arms. Their model-based trajectory tracking control requires the second time derivatives of the inverse dynamics solution, for which algorithms were proposed. Trajectory control of parallel kinematic manipulators (PKM) equipped with SEAs has not yet been pursued. A key element for this is the computationally efficient evaluation of the second time derivative of the inverse dynamics solution. This has not been presented in the literature, and is addressed in the present paper for the first time. The special topology of PKM is exploited by reusing the recursive algorithms for evaluating the inverse dynamics of serial robots. A Lie group formulation is used and all relations are derived within this framework. Numerical results are presented for a 6-DOF Gough-Stewart platform (as part of an exoskeleton), and for a planar PKM when a flatness-based control scheme is applied.
|
| |
| 15:12-15:18, Paper MoBT4.13 | Add to My Program |
| Manipulator Differential Kinematics Part I: Kinematics, Velocity, and Applications (I) |
|
| Haviland, Jesse | Queensland University of Technology |
| Corke, Peter | Queensland University of Technology |
Keywords: Kinematics, Motion Control, Manipulation Planning
Abstract: Manipulator kinematics is concerned with the motion of each link within a manipulator without considering mass or force. In this article, which is the first in a two-part tutorial, we provide an introduction to modelling manipulator kinematics using the elementary transform sequence (ETS). Then we formulate the first-order differential kinematics, which leads to the manipulator Jacobian, which is the basis for velocity control and inverse kinematics. We describe essential classical techniques which rely on the manipulator Jacobian before exhibiting some contemporary applications. Part II of this tutorial provides a formulation of second and higher-order differential kinematics, introduces the manipulator Hessian, and illustrates advanced techniques, some of which improve the performance of techniques demonstrated in Part I.
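As a concrete illustration of the first-order differential kinematics (a textbook planar example, not the article's ETS notation), the snippet below computes a two-link arm's Jacobian numerically and uses its pseudoinverse for one resolved-rate velocity step — the pattern underlying Jacobian-based velocity control and inverse kinematics.

```python
import numpy as np

def fk(q, l1=1.0, l2=1.0):
    """Forward kinematics of a planar 2R arm: joint angles -> tip position."""
    x = l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1])
    y = l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def jacobian(q, h=1e-6):
    """Numerical manipulator Jacobian dx/dq via central differences."""
    J = np.zeros((2, 2))
    for i in range(2):
        dq = np.zeros(2)
        dq[i] = h
        J[:, i] = (fk(q + dq) - fk(q - dq)) / (2 * h)
    return J

# Resolved-rate step: joint rates that realize a desired tip velocity.
q = np.array([0.3, 0.5])
xdot = np.array([0.1, 0.0])
qdot = np.linalg.pinv(jacobian(q)) @ xdot
print(jacobian(q) @ qdot)   # reproduces the desired tip velocity [0.1, 0.0]
```

In practice the Jacobian is computed analytically (e.g. from the ETS), but the numerical version makes the velocity relation xdot = J(q) qdot tangible.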
|
| |
| MoBT5 Regular session, 140E |
Add to My Program |
| Mechanism Design II |
|
| |
| Chair: Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
| Co-Chair: Suh, Jungwook | Kyungpook National University (KNU) |
| |
| 14:00-14:06, Paper MoBT5.1 | Add to My Program |
| Compliant Suction Gripper with Seamless Deployment and Retraction for Robust Picking against Depth and Tilt Errors |
|
| Yoo, Yuna | Seoul National University |
| Eom, Jaemin | Seoul National University Biorobotics Lab |
| Park, Min Jo | Seoul National University |
| Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
Keywords: Mechanism Design, Soft Robot Applications, Grippers and Other End-Effectors
Abstract: Applying suction grippers in unstructured environments is a challenging task because of depth and tilt errors in vision systems, requiring additional costs in elaborate sensing and control. To reduce additional costs, suction grippers with compliant bodies or mechanisms have been proposed; however, their bulkiness and limited allowable error hinder their use in complex environments with large errors. Here, we propose a compact suction gripper that can pick objects over a wide range of distances and tilt angles without elaborate sensing and control. The spring-inserted gripper body deploys and conforms to distant and tilted objects until the suction cup completely seals with the object and retracts immediately after, while holding the object. This seamless deployment and retraction is enabled by connecting the gripper body and suction cup to the same vacuum source, which couples the vacuum picking and retraction of the gripper body. Experimental results validated that the proposed gripper can pick objects within 79 mm, which is 1.4 times the initial length, and can pick objects with tilt angles up to 60°. The feasibility of the gripper was verified by demonstrations, including picking objects of different heights from the same picking height and the bin picking of transparent objects.
|
| |
| 14:06-14:12, Paper MoBT5.2 | Add to My Program |
| Design of Novel Knee Joint Mechanism of Lower-Limb Exoskeleton to Realize Spatial Motion of Human Knee |
|
| Hong, Man Bok | Agency for Defense Development |
| Kim, Yongcheol | Agency for Defense Development |
| Kim, Gwang Tae | Agency for Defense Development |
| Lee, Myunghyun | Agency for Defense Development |
| Kim, Seonwoo | Agency for Defense Development |
Keywords: Kinematics, Prosthetics and Exoskeletons, Mechanism Design
Abstract: The rotation axis of the human knee joint varies with the knee flexion angle. That is, human knee movement is spatial, with a dominant flexional rotation. The knee joints of most lower-limb exoskeletons have, however, been realized with a simple revolute pair for design simplicity. Wearing a knee joint with a simple revolute pair inevitably constrains the natural parasitic motion of the human knee joint. The rigid constraints imposed by the simple knee joint lower wearability and comfort. In addition, they may pose a risk of harming the wearer's knee joint structure. The concept of the polycentric knee is a well-known approach to mimicking the variation of the knee rotation center. Polycentric knees, however, realize only the planar trace of the projected points of the rotation axis. That is, the parasitic knee rotations of varus and internal rotation occurring naturally during knee flexion cannot be realized by polycentric knees. To resolve this, a novel spherical knee joint for exoskeletons is introduced in this paper to realize spherical knee movements. For the design, the change in instantaneous rotation axes during knee flexion is formulated as a spherical trace parameterized by the knee flexion angle. A spherical four-bar linkage is suggested to realize the axis trajectory, rather than a point trace. A method to find the instantaneous rotation axis and rotation matrix of the coupler link is derived for a given input link angle. Using these kinematic relations, the kinematic parameters of the mechanism are optimized to minimize the angle difference between the instantaneous axis of the coupler and the required axis of knee rotation. Finally, based on the synthesized kinematic parameters, a prototype design is introduced.
|
| |
| 14:12-14:18, Paper MoBT5.3 | Add to My Program |
| A Novel Coiled Cable-Conduit-Driven Hyper-Redundant Manipulator for Remote Operating in Narrow Spaces |
|
| Luo, Mingrui | Institute of Automation, Chinese Academy of Sciences |
| Tian, Yunong | Institute of Automation, Chinese Academy of Sciences |
| Li, En | Institute of Automation, Chinese Academy of Sciences |
| Chen, Minghao | Institute of Automation, Chinese Academy of Sciences |
| Kang, Cunfeng | Beijing University of Technology |
| Yang, Guodong | Institute of Automation, Chinese Academy of Sciences |
| Tan, Min | Institute of Automation, Chinese Academy of Sciences |
Keywords: Redundant Robots, Tendon/Wire Mechanism, Telerobotics and Teleoperation
Abstract: Operating in narrow spaces is an important challenge in the development of robots. Redundant manipulators are one way to solve this problem, but their mechanism design and control methods still have much room for improvement. In this paper, we propose a coiled cable-conduit-driven hyper-redundant manipulator (C-CDHRM) with great slenderness and flexibility. In terms of mechanism design, it considers both compactness and operability. By imitating the structure and behavior of a constricting snake, it can be uncoiled sequentially from a coiled storage state, led by the head. In terms of control methods, we propose a multi-layer control system that makes remote operations more accurate and reliable. On the one hand, guiding, segmenting, and following the path overcomes the planning ambiguity caused by redundancy. On the other hand, conduit transmission modeling and cable length correction overcome the nonlinear mapping of cable-driven joints, which was verified in experiments. Through tests, the mobile integrated system built around the C-CDHRM exhibits excellent performance in operation precision and accuracy, ensuring safety and accessibility in narrow spaces. Finally, in field experiments, the inspection and cleaning of various types of electrical equipment were successfully completed, showing excellent application prospects.
|
| |
| 14:18-14:24, Paper MoBT5.4 | Add to My Program |
| Design and Testing of a Flexure-Based XYZ Micropositioner with High Space-Utilization Efficiency |
|
| Lyu, Zekui | University of Macau |
| Xu, Qingsong | University of Macau |
Keywords: Compliant Joints and Mechanisms, Grippers and Other End-Effectors, Automation at Micro-Nano Scales
Abstract: The flexure-based XYZ micropositioner with a hybrid configuration has become more prevalent because it requires less mechanism decoupling and offers high motion precision. However, traditional mechanism designs suffer from large plane occupation due to Z-stage stacking, which leads to low space-utilization efficiency. To address this issue, a novel conceptual design is proposed in this paper by integrating a spatially structured XY stage with an embedded Z stage. After completing the mechanism design, the driving stiffness of the stage in the three axes is evaluated by mechanics analysis. Then, the model is verified through a finite element analysis simulation study and experimental tests. The theoretical model, simulation results, and experimental data show good agreement. Experimental results show that the proposed flexure-based XYZ micropositioner can deliver a stroke of 4.15 mm x 4.06 mm x 0.04 mm with a physical size of 116 mm x 116 mm x 45 mm. The performance comparison reveals that it has superior space-utilization efficiency. Given the feasibility of the proposed conceptual design, it provides a reference for the diversified and refined design of XYZ micropositioners.
|
| |
| 14:24-14:30, Paper MoBT5.5 | Add to My Program |
| Design and Development of a Deformable In-Pipe Inspection Robot for Various Diameter Pipes |
|
| Xu, Huafeng | The Hong Kong Polytechnic University |
| Cao, Jiannong | The Hong Kong Polytechnic University |
| Cheng, Zhiqin | The Hong Kong Polytechnic University |
| Liang, Zhixuan | The Hong Kong Polytechnic University |
| Chen, Jinlin | Hong Kong Polytechnic University |
Keywords: Wheeled Robots, Mechanism Design, Kinematics
Abstract: Pipelines have become one of the most important infrastructures in cities. Over time, they are prone to aging, cracks, and corrosion, and the need for regular inspection is gradually increasing. Robotic solutions are effective methods for in-pipe inspection. However, existing In-pipe Inspection Robots (IPIRs) require that the inner diameter of the pipe be fixed in the application scenario, and they need extra labor to control the robot and handle the cable. In this work, we design and develop a deformable robot that adapts to pipes with different inner diameters. Specifically, we use a passive elastic hinge to keep the robot fully in contact with the pipe, generating enough friction to ensure that the robot stays attached to the inner wall of the pipe. An edge device deployed on the robot generates wheel velocity commands from Inertial Measurement Unit (IMU) data, which eliminates the need for external devices. Experimental results demonstrate that the robot can move in horizontal and vertical pipelines, as well as traverse pipe joints and scenarios with dirt or small obstacles.
|
| |
| 14:30-14:36, Paper MoBT5.6 | Add to My Program |
| A Bioinspired Underactuated Dual Tendon-Based Adaptive Gripper for Space Applications |
|
| Isakhani, Hamid | University of Birmingham |
| Nefti-Meziani, Samia | University of Salford |
| Davis, Steven | University of Birmingham |
| Isakhani, Helya | Rebelya LTD |
Keywords: Tendon/Wire Mechanism, Space Robotics and Automation, Additive Manufacturing
Abstract: Hands are among the most intricate elements of a humanoid due to their role as end-effectors interacting with a non-linear surrounding environment. This paper presents the design of a bioinspired underactuated robotic hand with improved dexterity that is capable of adaptive grasping and manipulation of a wide range of objects using a dual-tendon mechanism. The proposed design focuses on the key elements of scalability, modularity, ease of fabrication, and cost efficiency to meet several imperative constraints of space applications. These features are achieved by introducing a novel actuation mechanism, manufacturing methods, and component design. In particular, monolithic finger modules are fabricated by fusing and integrating both hard and soft materials, analogous to bones wrapped in muscles, using economical and readily available materials and machines (an intermediate 3D printer). The weight-to-power ratio, actuation optimisation, design trade-offs, and various potential applications of the proposed adaptive hand are discussed in this paper. Furthermore, the prototype's performance is evaluated in different scenarios, which ultimately confirms its improved dexterity and gripping power compared to the literature.
|
| |
| 14:36-14:42, Paper MoBT5.7 | Add to My Program |
| Parallel-Jaw Gripper and Grasp Co-Optimization for Sets of Planar Objects |
|
| Jiang, Rebecca H. | Massachusetts Institute of Technology |
| Doshi, Neel | MIT |
| Gondhalekar, Ravi | The Charles Stark Draper Laboratory |
| Rodriguez, Alberto | Massachusetts Institute of Technology |
Keywords: Grippers and Other End-Effectors, Grasping, Methods and Tools for Robot System Design
Abstract: We propose a framework for optimizing a planar parallel-jaw gripper for use with multiple objects. While optimizing general-purpose grippers and contact locations for grasps are both well studied, co-optimizing grasps and the gripper geometry to execute them receives less attention. As such, our framework synthesizes grippers optimized to stably grasp sets of polygonal objects. Given a fixed number of contacts and their assignments to object faces and gripper jaws, our framework optimizes contact locations along these faces, gripper pose for each grasp, and gripper shape. Our key insights are to pose shape and contact constraints in frames fixed to the gripper jaws, and to leverage the linearity of constraints in our grasp stability and gripper shape models via an augmented Lagrangian formulation. Together, these enable a tractable nonlinear program implementation. We apply our method to several examples. The first illustrative problem shows the discovery of a geometrically simple solution where possible. In another, space is constrained, forcing multiple objects to be contacted by the same features as each other. Finally a toolset-grasping example shows that our framework applies to complex, real-world objects. We provide a physical experiment of the toolset grasps.
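The augmented Lagrangian machinery the framework leverages can be illustrated on a one-dimensional toy problem (purely illustrative; the paper couples grasp-stability and gripper-shape constraints posed in jaw-fixed frames): minimize f(x) = (x - 2)^2 subject to g(x) = x - 1 = 0, alternating a closed-form inner minimization with a multiplier update.

```python
def augmented_lagrangian(rho=10.0, iters=30):
    """Textbook augmented-Lagrangian iteration for
        min (x - 2)^2   s.t.   g(x) = x - 1 = 0.
    The inner minimization of
        L(x) = (x-2)^2 + lam*g(x) + (rho/2)*g(x)^2
    has the closed form x = (4 + rho - lam) / (2 + rho); the multiplier
    is then updated with lam += rho * g(x)."""
    lam = 0.0
    for _ in range(iters):
        x = (4.0 + rho - lam) / (2.0 + rho)   # argmin of the augmented Lagrangian
        lam += rho * (x - 1.0)                # multiplier update
    return x, lam

x, lam = augmented_lagrangian()
print(x, lam)   # converges to the constrained optimum x = 1, multiplier lam = 2
```

In the paper's nonlinear program the inner step is of course a numerical solve, but the outer multiplier-update loop follows this same pattern.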
|
| |
| 14:42-14:48, Paper MoBT5.8 | Add to My Program |
| Inertial Propulsion Robot Using the Shape Characteristics of a Streamlined Body Frame |
|
| Nishihara, Masatsugu | JAIST |
| Asano, Fumihiko | Japan Advanced Institute of Science and Technology |
Keywords: Underactuated Robots, Dynamics, Industrial Robots
Abstract: We have been investigating a crawling-like locomotion robot to make it slide forward efficiently on a slippery level surface based on simple system and control mechanisms, where the motion of the center of mass plays an important role. In this paper, we induce an effective motion of the center of mass by considering a streamlined body shape for a locomotion robot in which a pendulum is installed. First, we derive the equation of motion and a control input to achieve a desired motion of the inner pendulum. Second, we formulate constraint conditions between the streamlined body and a slippery floor. Third, we demonstrate through numerical simulation that the robot steadily slides forward by adopting the streamlined shape as the body frame. Fourth, we verify the numerical results through experiments, and the experimental results exhibit a similar tendency to the numerical results. Fifth, we find a locally optimal configuration for locomotion efficiency using Bayesian optimization, a class of machine-learning-based optimization, and achieve exceedingly efficient locomotion of the robot on the slippery floor at this optimum in both simulation and experiment.
|
| |
| 14:48-14:54, Paper MoBT5.9 | Add to My Program |
| Two-Stage Trajectory-Tracking Control of Cable-Driven Upper-Limb Exoskeleton Robots with Series Elastic Actuators: A Simple, Accurate, and Force-Sensorless Method |
|
| Shu, Yana | Tsinghua University |
| Chen, Yu | Tsinghua University |
| Zhang, Xuan | Tsinghua University |
| Zhang, Shisheng | Shenyang Jianzhu University |
| Chen, Gong | Shenzhen MileBot Robotics |
| Ye, Jing | Shenzhen MileBot Robotics Co. Ltd |
| Li, Xiang | Tsinghua University |
Keywords: Actuation and Joint Mechanisms, Motion Control, Prosthetics and Exoskeletons
Abstract: The advantages of cable-driven exoskeleton robots with series elastic actuators are twofold: 1) the inertia of the robot joint is relatively low, which is friendlier for human-robot interaction; 2) the elastic element is tolerant to impacts and hence provides structural safety. As trade-offs, the overall dynamic model of such a system is of high order and subject to both unmodelled disturbances (due to the cable-driven mechanism) and external torques (due to the human-robot interaction), opening up challenges for controller development. This paper proposes a new trajectory-tracking control scheme for cable-driven upper-limb exoskeleton robots with series elastic actuators. The control objectives are achieved in two stages: Stage I approximates and then compensates for unmodelled disturbances with iterative learning techniques; Stage II employs a suboptimal model predictive controller to drive the robot to track the desired trajectory. Although controlling such a robot is nontrivial, the proposed control scheme exhibits the advantages of force-sensorlessness, high accuracy, and low complexity compared with other methods in real-world experiments.
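Stage I's learning-based disturbance compensation can be sketched on a scalar toy plant (an assumed scheme for illustration; the paper's robot dynamics and learning law are more involved): over repeated trials, the residual between the plant and the compensated model updates a disturbance estimate, driving the tracking error toward zero.

```python
import numpy as np

def ilc_compensation(d_true, r, gamma=0.8, iterations=20):
    """Toy iterative-learning compensation on the scalar plant
    x[t+1] = x[t] + u[t] + d[t] with a repetitive, unmodelled d.
    Each trial cancels the current estimate d_hat; the measured
    residual (which equals d - d_hat) then updates d_hat."""
    T = len(r) - 1
    d_hat = np.zeros(T)
    for _ in range(iterations):
        x = np.zeros(T + 1)
        residual = np.zeros(T)
        for t in range(T):
            u = r[t + 1] - x[t] - d_hat[t]                   # compensate estimate
            x[t + 1] = x[t] + u + d_true[t]                  # plant with true d
            residual[t] = x[t + 1] - (x[t] + u + d_hat[t])   # = d[t] - d_hat[t]
        d_hat += gamma * residual                            # learning update
        err = np.max(np.abs(x - r))                          # trial tracking error
    return d_hat, err

T = 50
r = np.sin(np.linspace(0, 2 * np.pi, T + 1))           # desired joint trajectory
d_true = 0.3 * np.cos(np.linspace(0, 4 * np.pi, T))    # repetitive disturbance
d_hat, err = ilc_compensation(d_true, r)
print(err)   # tracking error vanishes as d_hat converges to the disturbance
```

No force sensor appears anywhere in the loop: the disturbance is inferred purely from position measurements, which is the spirit of the force-sensorless claim.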
|
| |
| 14:54-15:00, Paper MoBT5.10 | Add to My Program |
| A Retractable Soft Growing Robot with a Flexible Backbone |
|
| Pi, Xinyi | University of Sheffield |
| Szczech, Isabella Ann | The University of Sheffield |
| Cao, Lin | University of Sheffield |
Keywords: Mechanism Design, Soft Robot Materials and Design, Soft Sensors and Actuators
Abstract: Soft-growing robots are emerging with numerous potential applications because of their superior capability for frictionless navigation. However, their success is hindered by their tendency to buckle under the tension required to retract them via inversion. In this paper, we propose a simple and scalable tubular backbone to facilitate retracting the robot body without buckling. With this backbone, compressive forces at the robot's tip are mitigated and a limit is placed on the effective length for retraction during the application of tension. We first present the selection of the backbone and the development of such a retractable soft-growing robot. Along with the characterization of the working principles behind this buckling-free mechanism, success was observed with the use of the backbone in retraction tests. The effects of different parameters such as robot body length, air pressure, curvature, and retraction mode on the performance were also investigated. This backbone approach requires no bulky or in-situ mechatronic components inside the robot body and thus may be suited to medical applications that call for simple, compact designs free of in-situ electronics.
|
| |
| 15:00-15:06, Paper MoBT5.11 | Add to My Program |
| CurveQuad: A Centimeter-Scale Origami Quadruped That Leverages Curved Creases to Self-Fold and Crawl with One Motor |
|
| Feshbach, Daniel | University of Pennsylvania |
| Wu, Xuelin | University of Pennsylvania |
| Vasireddy, Satviki | Princeton Day School |
| Beardell, Louis | Episcopal Academy |
| To, Bao | Peddie School |
| Baryshnikov, Yuliy | UIUC |
| Sung, Cynthia | University of Pennsylvania |
Keywords: Underactuated Robots, Soft Robot Materials and Design, Flexible Robotics
Abstract: We present CurveQuad, a miniature curved origami quadruped that is able to self-fold and unfold, crawl, and steer, all using a single actuator. CurveQuad is designed for planar manufacturing, with parts that attach and stack sequentially on a flat body. The design uses 4 curved creases pulled by 2 pairs of tendons from opposite ends of a link on a 270° servo. It is 8 cm in the longest direction and weighs 10.9 g. Rotating the horn pulls the tendons inwards to induce folding. Continuing to rotate the horn shears the robot, enabling the robot to shuffle forward while turning in either direction. We experimentally validate the robot's ability to fold, steer, and unfold by changing the magnitude of horn rotation. We also demonstrate basic feedback control by steering towards a light source from a variety of starting positions and orientations, and swarm aggregation by having 4 robots simultaneously steer towards the light. The results demonstrate the potential of using curved crease origami in self-assembling and deployable robots with complex motions such as locomotion.
|
| |
| 15:06-15:12, Paper MoBT5.12 | Add to My Program |
| A Pendulum-Driven Legless Rolling Jumping Robot |
|
| Buzhardt, Jake | Clemson University |
| Chivkula, Prashanth | Clemson University |
| Tallapragada, Phanindra | Clemson University |
Keywords: Underactuated Robots, Dynamics, Passive Walking
Abstract: In this paper, we present a novel rolling, jumping robot. The robot consists of a driven pendulum mounted to a wheel in a compact, lightweight, 3D printed design. We show that by driving the pendulum to shift the robot's weight distribution, the robot is able to obtain significant rolling speed, achieve jumps of up to 2.5 body lengths vertically, and clear horizontal distances of over 6 body lengths. The robot's dynamic model is derived and simulation results indicate that it is consistent with the rolling motion and jumping observed on the robot. The ability to both roll and jump effectively using a minimalistic design makes this robot unique and could inspire the use of similar mechanisms on robots intended for applications in which agile locomotion on unstructured terrain is necessary, such as disaster response or planetary exploration.
|
| |
| 15:12-15:18, Paper MoBT5.13 | Add to My Program |
| AcroMonk: A Minimalist Underactuated Brachiating Robot |
|
| Javadi, Mahdi | German Research Center for Artificial Intelligence (DFKI), Robotics Innovation Center |
| Harnack, Daniel | Deutsches Forschungszentrum für Künstliche Intelligenz |
| Stocco, Paula | Stanford University |
| Kumar, Shivesh | DFKI GmbH |
| Vyas, Shubham | Robotics Innovation Center, DFKI GmbH |
| Pizzutilo, Daniel | DFKI RIC |
| Kirchner, Frank | University of Bremen |
Keywords: Underactuated Robots, Biologically-Inspired Robots, Education Robotics
Abstract: Brachiation is a dynamic, coordinated swinging maneuver of body and arms used by monkeys and apes to move between branches. As a unique underactuated mode of locomotion, it is interesting to study from a robotics perspective since it can broaden the deployment scenarios for humanoids and animaloids. While several brachiating robots of varying complexity have been proposed in the past, this paper presents the simplest possible prototype of a brachiation robot, using only a single actuator and unactuated grippers. The novel passive gripper design allows it to snap on and release from monkey bars, while guaranteeing well defined start and end poses of the swing. The brachiation behavior is realized in three different ways, using trajectory optimization via direct collocation and stabilization by a model-based time-varying linear quadratic regulator (TVLQR) or model-free proportional derivative (PD) control, as well as by a reinforcement learning (RL) based control policy. The three control schemes are compared in terms of robustness to disturbances, mass uncertainty, and energy consumption. The system design and controllers have been open-sourced. Due to its minimal and open design, the system can serve as a canonical underactuated platform for education and research.
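The TVLQR stabilization named in the abstract can be illustrated in its simplest possible form. This is not AcroMonk's open-sourced controller; it is a toy backward Riccati recursion for a scalar time-varying linear system, with every system and cost parameter (`a_seq`, `b_seq`, `q`, `r`, `qf`) invented purely for the example:

```python
def tvlqr_gains(a_seq, b_seq, q, r, qf):
    """Backward Riccati recursion for the scalar time-varying system
    x[t+1] = a[t]*x[t] + b[t]*u[t]
    with stage cost q*x^2 + r*u^2 and terminal cost qf*x^2."""
    T = len(a_seq)
    P = qf
    K = [0.0] * T
    for t in reversed(range(T)):
        a, b = a_seq[t], b_seq[t]
        K[t] = (a * b * P) / (r + b * b * P)          # time-varying gain
        P = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)
    return K

def rollout(x0, a_seq, b_seq, K):
    """Simulate the closed loop u[t] = -K[t]*x[t] and return the final state."""
    x = x0
    for t in range(len(K)):
        x = a_seq[t] * x + b_seq[t] * (-K[t] * x)
    return x
```

For an unstable plant (e.g. `a = 1.2`, `b = 1.0` at every step) the rolled-out closed loop contracts to near zero, while the uncontrolled state diverges geometrically.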
|
| |
| 15:18-15:24, Paper MoBT5.14 | Add to My Program |
| Design and Verification of Parallelogram Mechanism with Geared Unit Rolling Joints for Reliable Wiring |
|
| Suh, Jungwook | Kyungpook National University (KNU) |
| Choi, Wontae | Kyungpook National University (KNU) |
Keywords: Mechanism Design, Tendon/Wire Mechanism, Kinematics
Abstract: The structure of 1-DOF joints used in existing robots is generally a revolute joint or a prismatic joint. However, recently, attempts have been made to apply rolling joints to reduce the size and weight of surgical and humanoid robots. In this study, to secure the advantages of wire routing through robot joints, a new method for applying geared rolling units to a parallelogram mechanism is proposed. First, a kinematic analysis of the proposed gear-based mechanism is explained in comparison with the existing pivot-based mechanism. In addition, the importance of the radii of the gears is verified through force analysis to prevent damage to the applied gears, as well as through the analysis of actuation torque and singular positions, in which the parallelogram can convert into an anti-parallelogram. The effect of stable wiring was verified through an experiment using a cable-driven prototype. Consequently, the proposed parallelogram composed of rolling units is expected to be applied to various robot configurations owing to its advantages.
|
| |
| MoBT6 Regular session, 140FG |
Add to My Program |
| Modeling, Control, and Learning for Soft Robots II |
|
| |
| Chair: Blumenschein, Laura | Purdue University |
| Co-Chair: Ozkan-Aydin, Yasemin | University of Notre Dame |
| |
| 14:00-14:06, Paper MoBT6.1 | Add to My Program |
| Vine Robot Localization Via Collision |
|
| Frias-Miranda, Eugenio | Purdue University |
| Srivastava, Alankriti | Purdue University |
| Wang, Sicheng | Purdue University |
| Blumenschein, Laura | Purdue University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Materials and Design, Localization
Abstract: Localization of robots is a complex task that is often hindered by the sensors these systems use. Because most field robots are rigid, their sensing modalities share common failure modes, such as degraded performance when camera vision is obscured. In addition, rigid systems lack flexibility when traversing multiple environments, especially uneven and unpredictable ground. Soft robots, which can adaptably interact with the environment, could serve as a solution to both problems. One specific soft robot, the Vine Robot, has exhibited excellent performance while moving through constrained, unpredictable environments. This makes the Vine Robot an ideal candidate for a novel method of sensing and localization: obstacle collision localization. We use our understanding of the nature of Vine Robot motion to predict the tip position of the robot at every instant based on sensor feedback. Through single-obstacle experiments, we found that our algorithm provides a precise estimate of the robot's tip position in differing environments. Further, in a multi-obstacle demonstration, less than 5 percent maximum error relative to the full robot length was observed on the path prediction. Our study helps lay the foundation for a new method of Vine Robot localization using contact as a sensing modality.
|
| |
| 14:06-14:12, Paper MoBT6.2 | Add to My Program |
| Mapping Unknown Environments through Passive Deformation of Soft, Growing Robots |
|
| Fuentes, Francesco | Purdue University |
| Blumenschein, Laura | Purdue University |
Keywords: Modeling, Control, and Learning for Soft Robots, Mapping
Abstract: When faced with an unstructured environment filled with an unknown number and size of obstacles on a chaotic terrain, it can be a challenge to determine the best method of navigating and mapping the space. This problem, known as Simultaneous Localization and Mapping (SLAM), has typically been approached using vision-based solutions, but these solutions require clear visual conditions in order to function optimally. A different approach to sensing environments has been explored in soft robotic systems, specifically by sensing changes in the environment through sensing changes in the robot's configuration. Building on this idea, we introduce a method of mapping based on colliding with and deforming around obstacles using a soft, growing robot. Instead of avoiding obstacles, as is typically done to protect robots, we take advantage of the soft, growing robot's compliance in order to navigate through, and collect information about, the environment. Through the construction and testing of a geometry-based simulation, we analyzed the behavior and effectiveness of this approach for mapping by generating random launch positions and collecting information from contacted obstacles and traversed regions. Through a plethora of randomly generated environments, we determine that: 1) the density of obstacles in an environment has minimal impact on mapping abilities and 2) at least 70% of each environment tested can be mapped by deploying 20 or fewer soft, growing robots.
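The random-launch coverage analysis described above can be illustrated with a deliberately stripped-down Monte Carlo sketch. It assumes straight-line growth across an empty grid (no obstacles, no deformation) and is not the authors' geometry-based simulator; the function name and parameters are hypothetical:

```python
import random

def deploy_and_map(n_robots, grid=20, seed=0):
    """Toy coverage estimate: each 'growing robot' is launched from a random
    cell on the left edge and grows across the grid along a fixed heading,
    marking every traversed cell as mapped. Returns the mapped fraction."""
    rng = random.Random(seed)
    mapped = set()
    for _ in range(n_robots):
        row = rng.randrange(grid)          # random launch row
        drow = rng.choice([-1, 0, 1])      # coarse random heading
        r, c = row, 0
        while 0 <= r < grid and c < grid:
            mapped.add((r, c))
            r, c = r + drow, c + 1
    return len(mapped) / (grid * grid)
```

Because the generator is seeded, the first 5 launches of a 20-robot run replay the 5-robot run, so coverage is monotone in the number of robots deployed, mirroring the kind of coverage-versus-deployments curve the abstract reports.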
|
| |
| 14:12-14:18, Paper MoBT6.3 | Add to My Program |
| Stable Real-Time Feedback Control of a Pneumatic Soft Robot |
|
| Even, Sean | University of Notre Dame |
| Zheng, Tongjia | University of Notre Dame |
| Lin, Hai | University of Notre Dame |
| Ozkan-Aydin, Yasemin | University of Notre Dame |
Keywords: Modeling, Control, and Learning for Soft Robots, Biomimetics, Soft Robot Applications
Abstract: Soft actuators offer compliant and safe interaction with an unstructured environment compared to their rigid counterparts. However, control of these systems is often challenging because they are inherently under-actuated, have infinite degrees of freedom (DoF), and their mechanical properties can be changed by unknown external loads. Existing works mainly relied on discretization and reduction, suffering from either low accuracy or high computational cost for real-time control purposes. Recently, we presented an infinite-dimensional feedback controller for soft manipulators modeled by partial differential equations (PDEs) based on Cosserat rod theory. In this study, we examine how to implement this controller in real time using only a limited number of actuators. To do so, we formulate a convex quadratic programming problem that tunes the feedback gains of the controller in real time such that it becomes realizable by the actuators. We evaluate the controller's performance through experiments on a physical soft robot capable of planar motions and show that the actual controller implemented by the finite-dimensional actuators still preserves the stabilizing property of the desired infinite-dimensional controller. This research fills the gap between infinite-dimensional control design and finite-dimensional actuation in practice and suggests a promising direction for exploring PDE-based control design for soft robots.
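The gain-tuning quadratic program in the abstract couples the actuators through the PDE model, which is beyond a short sketch. As a hedged illustration of the underlying idea only, the separable special case (find the achievable actuator commands closest to the idealized infinite-dimensional ones under box limits) has a closed-form solution by element-wise clipping:

```python
def realizable_command(u_desired, u_min, u_max):
    """Project idealized actuator commands onto the box of achievable ones.
    For the separable QP
        min sum((u[i] - u_desired[i])**2)  s.t.  u_min[i] <= u[i] <= u_max[i]
    the optimum is simply element-wise clipping."""
    return [min(max(u, lo), hi) for u, lo, hi in zip(u_desired, u_min, u_max)]
```

The paper's actual QP is not separable, so a general-purpose convex solver is needed there; this snippet only shows why a box-constrained least-squares projection is the natural formulation.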
|
| |
| 14:18-14:24, Paper MoBT6.4 | Add to My Program |
| Real2Sim2Real Transfer for Control of Cable-Driven Robots Via a Differentiable Physics Engine |
|
| Wang, Kun | Amazon.com LLC |
| Johnson, William | Yale University |
| Lu, Shiyang | Rutgers University |
| Huang, Xiaonan | University of Michigan |
| Booth, Joran | Yale University |
| Kramer-Bottiglio, Rebecca | Yale University |
| Aanjaneya, Mridul | Rutgers University |
| Bekris, Kostas E. | Rutgers, the State University of New Jersey |
Keywords: Modeling, Control, and Learning for Soft Robots, Simulation and Animation, Model Learning for Control
Abstract: Tensegrity robots, composed of rigid rods and flexible cables, exhibit high strength-to-weight ratios and significant deformations, which enable them to navigate unstructured terrains and survive harsh impacts. They are hard to control, however, due to high dimensionality, complex dynamics, and a coupled architecture. Physics-based simulation is a promising avenue for developing locomotion policies that can be transferred to real robots. Nevertheless, modeling tensegrity robots is a complex task due to a substantial sim2real gap. To address this issue, this paper describes a Real2Sim2Real (R2S2R) strategy for tensegrity robots. This strategy is based on a differentiable physics engine that can be trained given limited data from a real robot. These data include offline measurements of physical properties, such as mass and geometry for various robot components, and the observation of a trajectory using a random control policy. With the data from the real robot, the engine can be iteratively refined and used to discover locomotion policies that are directly transferable to the real robot. Beyond the R2S2R pipeline, key contributions of this work include computing non-zero gradients at contact points, a loss function for matching tensegrity locomotion gaits, and a trajectory segmentation technique that avoids conflicts in gradient evaluation during training. Multiple iterations of the R2S2R process are demonstrated and evaluated on a real 3-bar tensegrity robot.
|
| |
| 14:24-14:30, Paper MoBT6.5 | Add to My Program |
| Multi-Gait Locomotion Planning and Tracking for Tendon-Actuated Terrestrial Soft Robot (TerreSoRo) |
|
| Mahendran, Arun Niddish | The University of Alabama, Tuscaloosa |
| Freeman, Caitlin | University of Alabama |
| Chang, Alexander | Georgia Institute of Technology |
| McDougall, Michael | University of Strathclyde Glasgow |
| Vikas, Vishesh | University of Alabama |
| Vela, Patricio | Georgia Institute of Technology |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Motion and Path Planning
Abstract: The adaptability of soft robots makes them ideal candidates to maneuver through unstructured environments. However, locomotion challenges arise due to complexities in modeling the body mechanics, actuation, and robot-environment dynamics. These factors contribute to the gap between their potential and actual autonomous field deployment. A closed-loop path planning framework for soft robot locomotion is critical to close this real-world realization gap. This paper presents a generic path planning framework applied to TerreSoRo (Tetra-Limb Terrestrial Soft Robot) with pose feedback. It employs a gait-based, lattice trajectory planner to facilitate navigation in the presence of obstacles. The locomotion gaits are synthesized using a data-driven optimization approach that allows for learning from the environment. The trajectory planner employs a greedy breadth-first search strategy to obtain a collision-free trajectory. The synthesized trajectory is a sequence of rotate-then-translate gait pairs. The control architecture integrates high-level and low-level controllers with real-time localization (using an overhead webcam). TerreSoRo successfully navigates environments with obstacles, where path re-planning is performed. To the best of our knowledge, this is the first instance of real-time, closed-loop path planning of a non-pneumatic soft robot.
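The breadth-first lattice search described above can be sketched on an idealized grid where each edge stands in for one rotate-then-translate gait pair. This is an illustrative reduction, not TerreSoRo's planner; the function, grid, and collision model are assumptions:

```python
from collections import deque

def bfs_gait_plan(start, goal, obstacles, size=10):
    """Breadth-first search over a gait lattice: each edge is one
    rotate-then-translate primitive, idealized here as a unit move in one
    of four headings on a grid. Returns a collision-free cell sequence."""
    frontier = deque([start])
    parent = {start: None}
    while frontier:
        cell = frontier.popleft()
        if cell == goal:                       # reconstruct the path
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < size and 0 <= nxt[1] < size
                    and nxt not in obstacles and nxt not in parent):
                parent[nxt] = cell
                frontier.append(nxt)
    return None                                # no collision-free trajectory
```

Breadth-first expansion guarantees the returned sequence uses the fewest gait pairs; re-planning after a pose update amounts to re-running the search from the newly localized cell.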
|
| |
| 14:30-14:36, Paper MoBT6.6 | Add to My Program |
| Learning Soft Robot Dynamics Using Differentiable Kalman Filters and Spatio-Temporal Embeddings |
|
| Liu, Xiao | Arizona State University |
| Ikemoto, Shuhei | Kyushu Institute of Technology |
| Yoshimitsu, Yuhei | Kyushu Institute of Technology |
| Ben Amor, Heni | Arizona State University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Soft Robot Materials and Design
Abstract: This paper introduces a novel approach for modeling the dynamics of soft robots, utilizing a differentiable filter architecture. The proposed approach enables end-to-end training to learn system dynamics, noise characteristics, and temporal behavior of the robot. A novel spatio-temporal embedding process is discussed to handle observations with varying sensor placements and sampling frequencies. The efficacy of this approach is demonstrated on a tensegrity robot arm by learning end-effector dynamics from demonstrations with complex bending motions. The model is proven to be robust against missing modalities, diverse sensor placement, and varying sampling rates. Additionally, the proposed framework is shown to identify physical interactions with humans during motion. The utilization of a differentiable filter presents a novel solution to the difficulties of modeling soft robot dynamics. Our approach shows substantial improvement in accuracy compared to state-of-the-art filtering methods, with at least a 24% reduction in mean absolute error (MAE) observed. Furthermore, the predicted end-effector positions show an average MAE of 25.77 mm from the ground truth, highlighting the advantage of our approach. The code is available at https://github.com/ir-lab/soft_robot_DEnKF.
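The differentiable filter in the paper learns its models end to end; for orientation, a plain scalar Kalman filter shows the predict/update structure that those learned components replace. The measurement sequence and every noise value below are arbitrary illustration choices, not the paper's data:

```python
def kalman_1d(zs, x0=0.0, p0=1.0, q=0.01, r=0.5):
    """Plain scalar Kalman filter for a constant-state model x[t+1] = x[t]
    with process noise variance q and measurement noise variance r. A
    differentiable filter learns the dynamics and these noise parameters
    from data instead of fixing them by hand."""
    x, p = x0, p0
    estimates = []
    for z in zs:
        p = p + q                      # predict: propagate uncertainty
        k = p / (p + r)                # Kalman gain
        x = x + k * (z - x)            # update with the innovation
        p = (1 - k) * p
        estimates.append(x)
    return estimates

# noisy measurements of a constant true value 2.0 (made-up example data)
meas = [2.3, 1.8, 2.1, 1.9, 2.2, 2.0, 1.95, 2.05]
est = kalman_1d(meas)
```

The estimate settles near the true value as the gain shrinks; in the learned setting, backpropagating through exactly this loop is what makes the filter "differentiable".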
|
| |
| 14:36-14:42, Paper MoBT6.7 | Add to My Program |
| Closed Loop Control of Tendon Driven Continuum Robots Using IMUs |
|
| Srivastava, Manu | Clemson University |
| Groff, Richard | Clemson University |
| Walker, Ian | Clemson University |
Keywords: Modeling, Control, and Learning for Soft Robots, Sensor-based Control, Biomimetics
Abstract: In this paper, we present a new approach to the control of continuum robot sections using IMU quaternion feedback. We use a discrete-time root-finding algorithm to drive a continuum section in the desired shape-space direction. We found that the approach lacks end-effector positioning accuracy when used by itself; however, when used in conjunction with a feedforward model, it actively counters the influence of unmodeled factors. The approach is implemented on a single section of a continuum hose robot developed for 3D printing of concrete in construction applications. The results demonstrate significant improvements in positioning accuracy compared to standalone kinematics/mechanics-based position control of tendon lengths. Additionally, this approach can be implemented using low-cost sensing and control hardware.
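A discrete-time root-finding loop of the kind described above can be sketched as a secant iteration on sampled feedback: the controller only queries measurements, never an analytic model. The `plant` below is a hypothetical tanh response standing in for the real tendon-to-bend relationship:

```python
import math

def secant_servo(measure, target, u0=0.0, u1=1.0, tol=1e-6, max_iter=50):
    """Discrete-time root finding on the feedback error: find the actuator
    command u with measure(u) == target using only sampled measurements,
    via a secant iteration (no analytic derivative of the plant needed)."""
    f0, f1 = measure(u0) - target, measure(u1) - target
    for _ in range(max_iter):
        if abs(f1) < tol:
            return u1
        u0, u1 = u1, u1 - f1 * (u1 - u0) / (f1 - f0)   # secant step
        f0, f1 = f1, measure(u1) - target
    return u1

# hypothetical plant: bend angle (degrees) vs. tendon displacement
plant = lambda u: 40.0 * math.tanh(0.5 * u)
u_star = secant_servo(plant, 30.0)
```

In the paper's setting the measurement would come from the IMU quaternion rather than a closed-form function, which is precisely why a sampled root-finder is attractive.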
|
| |
| 14:42-14:48, Paper MoBT6.8 | Add to My Program |
| Machine Learning Best Practices for Soft Robot Proprioception |
|
| Zhang, Annan | Massachusetts Institute of Technology |
| Wang, Tsun-Hsuan | Massachusetts Institute of Technology |
| Truby, Ryan | Northwestern University |
| Chin, Lillian | Massachusetts Institute of Technology |
| Rus, Daniela | MIT |
Keywords: Modeling, Control, and Learning for Soft Robots, Performance Evaluation and Benchmarking, Soft Sensors and Actuators
Abstract: Machine learning-based approaches for soft robot proprioception have recently gained popularity, in part due to the difficulties in modeling the relationship between sensor signals and robot shape. However, to date, there exists no systematic analysis of the required design choices to set up a machine learning pipeline for soft robot proprioception. Here, we present the first study examining how design choices on different levels of the machine learning pipeline affect the performance of a neural network for predicting the state of a soft robot. We address the most frequent questions researchers face, such as how to choose the appropriate sensor and actuator signals, process input and output data, deal with time series, and pick the best neural network architecture. By testing our hypotheses on data collected from two vastly different systems--an electrically actuated robotic platform and a pneumatically actuated soft trunk--we seek conclusions that may generalize beyond one specific type of soft robot and hope to provide insights for researchers to use machine learning for soft robot proprioception.
|
| |
| 14:48-14:54, Paper MoBT6.9 | Add to My Program |
| Modeling and Analysis of Tendon-Driven Continuum Robots for Rod-Based Locking |
|
| Rao, Priyanka | University of Toronto |
| Pogue, Chloe | University of Toronto |
| Peyron, Quentin | Inria and CRIStAL UMR CNRS 9189, University of Lille |
| Diller, Eric D. | University of Toronto |
| Burgner-Kahrs, Jessica | University of Toronto |
Keywords: Modeling, Control, and Learning for Soft Robots, Flexible Robotics, Kinematics
Abstract: Various design modifications have been proposed for tendon-driven continuum robots to improve their stiffness and workspace. One of them is using locking mechanisms to constrain the lengths of rods or passive backbones along the backbone length. However, physics-based models used to predict these robots' behaviour commonly assume that the curvature of the locked portion does not change during robot actuation or that the effects of friction and gravity are negligible. In addition, these models do not consider the variation in twist on the application of force. In this letter, we propose a 3D static model for tendon-driven continuum robots experiencing locking due to length constraints on rods along their backbone. The proposed model is evaluated on prototypes of length 240 mm, with up to three locking mechanisms and has an accuracy of 3.63% w.r.t. length. Using the proposed model, a compliance analysis is performed studying the evolution of the robot compliance with the position of the locking mechanisms. An actuation strategy is proposed that can allow the robot to achieve the same shape with different compliance.
|
| |
| 14:54-15:00, Paper MoBT6.10 | Add to My Program |
| Path Planning Method with Constant Bending Angle Constraint for Soft Growing Robot Using Heat Welding Mechanism |
|
| Satake, Yuki | Waseda University |
| Ishii, Hiroyuki | Waseda University |
Keywords: Modeling, Control, and Learning for Soft Robots, Motion and Path Planning
Abstract: Soft growing robots, a new class of soft mobile robot, have recently attracted considerable interest. Many soft growing robots exist, some of which have irreversible growing and bending mechanisms. For such robots, path planning methods that provide information on bending timing improve operation efficiency. Although various path planning methods have already been developed, they cannot be applied to our growing robot, which uses a heat welding mechanism, because it is constrained to a single constant bending angle. This article proposes a novel path planning method with a constant bending angle constraint. The proposed algorithm was developed based on the rapidly-exploring random tree star (RRT*) algorithm. The method incorporates an algorithm for removing unnecessary nodes from obtained paths while keeping bending angles constant, improving path optimality. We confirmed that the proposed method generates paths whose bending angles are constant. In addition, we experimented with moving our robot along the planned path in a field with obstacles. The result showed that the proposed method enabled the robot to reach the target location while avoiding obstacles. The proposed method improves the operating efficiency of our soft growing robot.
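The constant-bending-angle constraint can be made concrete with a small RRT-style sketch in which every extension either goes straight or bends by exactly one fixed angle. This is an illustrative simplification, not the authors' RRT*-based method (no rewiring or node-reduction step); the bend angle, step length, workspace, and goal bias are all assumed values:

```python
import math
import random

BEND = math.radians(30.0)   # the single allowed bending angle (assumed value)
STEP = 1.0                  # growth length per segment (assumed value)

def grow(state, turn):
    """Extend one segment; turn is -1 (bend right), 0 (straight), +1 (bend left)."""
    x, y, th = state
    th += turn * BEND
    return (x + STEP * math.cos(th), y + STEP * math.sin(th), th)

def plan_constant_bend(start, goal, obstacles, iters=6000, seed=3):
    """RRT-like search whose only steering primitives respect the constraint."""
    rng = random.Random(seed)
    nodes, parent = [start], {0: None}
    for _ in range(iters):
        target = goal if rng.random() < 0.3 else (rng.uniform(0, 10), rng.uniform(0, 10))
        i = min(range(len(nodes)),
                key=lambda k: (nodes[k][0] - target[0]) ** 2 + (nodes[k][1] - target[1]) ** 2)
        # try the three legal primitives; keep the collision-free one closest to the sample
        best = None
        for turn in (-1, 0, 1):
            cand = grow(nodes[i], turn)
            if any(math.hypot(cand[0] - ox, cand[1] - oy) < orad for ox, oy, orad in obstacles):
                continue
            d = math.hypot(cand[0] - target[0], cand[1] - target[1])
            if best is None or d < best[0]:
                best = (d, cand)
        if best is None:
            continue
        nodes.append(best[1])
        parent[len(nodes) - 1] = i
        if math.hypot(best[1][0] - goal[0], best[1][1] - goal[1]) < 1.2:
            path, j = [], len(nodes) - 1
            while j is not None:
                path.append(nodes[j])
                j = parent[j]
            return path[::-1]
    return None
```

Every heading change along a returned path is exactly 0 or ±BEND by construction, which is the property the authors' planner must preserve while additionally optimizing the path.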
|
| |
| 15:00-15:06, Paper MoBT6.11 | Add to My Program |
| Static Shape Control of Soft Continuum Robots Using Deep Visual Inverse Kinematic Models (I) |
|
| Almanzor, Elijah | University of Cambridge |
| Ye, Fan | University of Cambridge |
| Shi, Jialei | University College London |
| George Thuruthel, Thomas | University College London |
| Wurdemann, Helge Arne | University College London |
| Iida, Fumiya | University of Cambridge |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Deep Learning in Robotics and Automation, Medical Robots and Systems
Abstract: Soft continuum robots are highly flexible and adaptable, making them ideal for unstructured environments such as the human body and agriculture. However, their high compliance and manoeuvrability make them difficult to model, sense, and control. Current control strategies focus on Cartesian space control of the end-effector, but few works have explored full-body control. This study presents a novel image-based deep learning approach for closed-loop kinematic shape control of soft continuum robots. The method combines a local inverse kinematics formulation in the image-space with deep convolutional neural networks for accurate shape control that is robust to feedback noise and mechanical changes in the continuum arm. The shape controller is fast and straightforward to implement; it takes only a few hours to generate training data, train the network, and deploy, requiring only a web camera for feedback. This method offers an intuitive and user-friendly way to control the robot's 3D shape and configuration through teleoperation using only 2D hand-drawn images of the desired target state without the need for further user instruction or consideration of the robot's kinematics.
|
| |
| 15:06-15:12, Paper MoBT6.12 | Add to My Program |
| Model Predictive Control Applied to Different Time-Scale Dynamics of Flexible Joint Robots |
|
| Iskandar, Maged | German Aerospace Center - DLR |
| van Ommeren, Christiaan | Technical University of Munich |
| Wu, Xuwei | German Aerospace Center (DLR) |
| Albu-Schäffer, Alin | DLR - German Aerospace Center |
| Dietrich, Alexander | German Aerospace Center (DLR) |
Keywords: Modeling, Control, and Learning for Soft Robots, Compliance and Impedance Control, Compliant Joints and Mechanisms
Abstract: Modern lightweight robots are constructed to be collaborative, which often results in a low structural stiffness compared to conventional rigid robots. Therefore, the controller must be able to handle the dynamic oscillatory effects caused mainly by the intrinsic joint elasticity. Singular perturbation theory makes it possible to decompose the flexible-joint dynamics into fast and slow subsystems. This model separation makes it possible to incorporate future knowledge of the joint-level dynamic behavior into the controller design using Model Predictive Control (MPC). In this study, different architectures that combine singular perturbation and MPC are considered. For singular perturbation, the parameters that influence the validity of using this technique to control a flexible-joint robot are investigated. Furthermore, limits on the input constraints for the future trajectory are considered with MPC. The position control performance and robustness against external forces of each architecture are validated experimentally on a flexible-joint robot. The experimental validation shows superior performance in practice for the presented MPC framework, especially with respect to the actuator torque limits.
|
| |
| 15:12-15:18, Paper MoBT6.13 | Add to My Program |
| A Framework for Simulation of Magnetic Soft Robots Using the Material Point Method |
|
| Davy, Joshua | University of Leeds |
| Lloyd, Peter Robert | University of Leeds |
| Chandler, James Henry | University of Leeds |
| Valdastri, Pietro | University of Leeds |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Materials and Design, Simulation and Animation
Abstract: Simulation represents a key aspect in the development of robot systems. The ability to simulate the behavior of real-world robots provides an environment where robot designs can be developed and control systems optimized. Due to the use of external magnetic fields for actuation, magnetic soft robots can be wirelessly controlled and are easily miniaturized. However, the relationship between magnetic soft materials and external sources of magnetic fields presents significant modelling complexities due to the interplay between material elasticity and magnetic wrench (forces and torques). In this work, we present a simulation framework for magnetic soft robots using the Material Point Method (MPM), which integrates hyperelastic material models with the magnetic wrench induced under external fields. Compared to existing Finite Element Methods (FEM), the presented MPM-based framework inherently models self-collision between areas of the model and can capture the effect of forces in non-homogeneous magnetic fields. We demonstrate the ability of the MPM framework to model the influence of magnetic wrench on magnetic soft robots, capture the dynamic behavior of robots under time-varying magnetic fields, and provide an accurate representation of deformation when colliding with obstacles. We show the versatility of the MPM framework by comparing simulations to a range of real-world magnetic soft robot designs previously presented in the literature.
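MPM follows a particle-to-grid / grid-update / grid-to-particle cycle. A minimal one-dimensional skeleton of that cycle (gravity only, with linear weights; the hyperelastic and magnetic-wrench terms that the paper adds would enter at the grid-update step) looks like this; all parameter values are arbitrary illustration choices:

```python
def mpm_1d_freefall(steps=100, n_grid=16, dt=1e-3, g=-9.8):
    """One-dimensional MPM skeleton on [0, 1]:
    P2G: scatter particle mass/momentum to grid nodes with linear weights;
    grid update: momentum change from body forces (gravity only here);
    G2P: gather grid velocities back to particles, then advect."""
    dx = 1.0 / n_grid
    xp = [0.3, 0.5, 0.7]          # particle positions
    vp = [0.0, 0.0, 0.0]          # particle velocities
    mp = [1.0, 1.0, 1.0]          # particle masses
    for _ in range(steps):
        m = [0.0] * (n_grid + 1)
        mv = [0.0] * (n_grid + 1)
        for p in range(len(xp)):                          # P2G
            i = int(xp[p] / dx)
            w = xp[p] / dx - i
            for node, wt in ((i, 1.0 - w), (i + 1, w)):
                m[node] += wt * mp[p]
                mv[node] += wt * mp[p] * vp[p]
        v = [mv[n] / m[n] + dt * g if m[n] > 0 else 0.0   # grid update
             for n in range(n_grid + 1)]
        for p in range(len(xp)):                          # G2P + advection
            i = int(xp[p] / dx)
            w = xp[p] / dx - i
            vp[p] = (1.0 - w) * v[i] + w * v[i + 1]
            xp[p] += dt * vp[p]
    return xp, vp
```

Because everything lives on a background grid, self-collision handling falls out of the shared nodal velocities rather than needing explicit contact pairs, which is the advantage over FEM that the abstract highlights.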
|
| |
| MoBT7 Regular session, 258/259 |
Add to My Program |
| Micro and Nano Robotics |
|
| |
| Chair: Cappelleri, David | Purdue University |
| Co-Chair: Tan, Liyuan | Purdue University |
| |
| 14:00-14:06, Paper MoBT7.1 | Add to My Program |
| Design, Fabrication, and Characterization of a Helical Adaptive Multi-Material MicroRobot (HAMMR) |
|
| Tan, Liyuan | Purdue University |
| Cappelleri, David | Purdue University |
Keywords: Micro/Nano Robots, Medical Robots and Systems
Abstract: Adaptive locomotion is an advanced function of microrobots that can be achieved using smart materials. In this paper, a responsive hydrogel is utilized as a smart material to fabricate Helical Adaptive Multi-material MicroRobots (HAMMRs) with deformable tails that achieve adaptive locomotion capabilities. Moreover, a novel fabrication method is proposed to realize these types of helical microrobots with enhanced swimming performance by taking advantage of a strong magnetic head and a deformable tail. The deformations of different tail designs and of the fabricated microrobots are tested in different solvents. The swimming performances of the microrobots are investigated experimentally under a rotating magnetic field and verified with theoretical calculations. The HAMMRs show significant deformations upon stimulation, with changes in swimming performance that agree with the scaled calculation result. Finally, the HAMMRs exhibit enhanced mobility, with the highest published translational velocity for an adaptive swimming microrobot of 8.1 body lengths per second.
|
| |
| 14:06-14:12, Paper MoBT7.2 | Add to My Program |
| Active Capsule System for Multiple Therapeutic Patch Delivery: Preclinical Evaluation |
|
| Lee, Jihun | Daegu Gyeongbuk Institute of Science and Technology |
| Hoang, Manh Cuong | Chonnam National University |
| Kim, Jayoung | Korea Institute of Medical Microrobotics |
| Choe, Eunho | Korea Institute of Medical Microrobotics |
| Kee, Hyeonwoo | DGIST |
| Yang, Seungun | DGIST |
| Park, Jongoh | Chonnam National University |
| Park, Sukho | DGIST |
Keywords: Micro/Nano Robots, Medical Robots and Systems, Automation at Micro-Nano Scales
Abstract: Recently, active research has been conducted on the therapeutic functions of capsule endoscopes. Here, we propose an active capsule system that captures images of the interior of the gastrointestinal (GI) tract and actively delivers therapeutic patches. The active capsule system mainly comprises therapeutic patches, an active capsule equipped with a camera, and a robot-assisted magnetic actuator. The active capsule moves inside the GI tract via the robot-assisted magnetic actuator, captures pictures of the GI tract in real time, and performs hemostatic treatment by delivering therapeutic patches to the target lesions. First, the fundamental performance of the active capsule system was verified via a hemostatic performance test of the therapeutic patch, a patch-contamination-prevention test of the active capsule, and a basic actuation test of the capsule. Second, multiple therapeutic patches were delivered to the gastric surface in an ex vivo test using the active capsule system. Finally, a preclinical animal test using a porcine model confirmed that GI tract examination and therapeutic patch delivery were possible with the active capsule system. Consequently, the proposed active capsule system represents a new paradigm for capsule endoscopy with multiple therapeutic patch delivery capabilities.
|
| |
| 14:12-14:18, Paper MoBT7.3 | Add to My Program |
| Parallel Cell Array Patterning and Target Cell Lysis on an Optoelectronic Micro-Well Device |
|
| Gan, Chunyuan | Beihang University |
| Xiong, Hongyi | Beihang University |
| Zhao, Jiawei | Beihang University, School of Mechanical Engineering and Automati |
| Wang, Ao | BUAA |
| Wang, Chutian | Beihang University |
| Liang, Shuzhang | Beihang University |
| Zhang, Jiaying | Beihang University, School of Mechanical Engineering &Automation |
| Feng, Lin | Beihang University |
Keywords: Biological Cell Manipulation, Automation at Micro-Nano Scales, Micro/Nano Robots
Abstract: This work presents a novel electrical method, implemented in the form of a microfluidic device, for cell arraying and target cell lysis. The microfluidic device contains a micro-well array on a photoconductive layer based on the optoelectronic tweezers (OET) method, where parallel cell manipulation is performed. As the cell suspension flows over the micro-wells, cells are actively captured in the micro-wells by light-induced dielectrophoresis (DEP) forces, forming the designed pattern array in less than 120 s. The single-cell capture rate is over 83% in the patterned cell array, and about 94% of the micro-wells are occupied by cells. Then, the target cell in a specific micro-well is illuminated and lysed by electroporation in 5 seconds. The micro-well barriers and DEP forces block the influence of the flow, and a relatively closed space is critical to preserving the cell lysates. Experiments show that light-induced DEP cell capture and target cell electroporation can be modulated by changing the light patterns and the applied signal. This device, based on OET and dynamic electroporation, enables rapid cell capture and target lysis at the single-cell level and can support single-cell-based studies, such as molecular diagnostics and disease detection.
|
| |
| 14:18-14:24, Paper MoBT7.4 | Add to My Program |
| Microrobot Control Method Based on Movement of Field Free Point in Gradient Magnetic Field |
|
| Wang, Chutian | Beihang University |
| Ji, Yiming | Beihang University |
| Luo, Xinyun | Beihang University |
| Gan, Chunyuan | Beihang University |
| Wang, Ao | BUAA |
| Zhao, Jiawei | Beihang University, School of Mechanical Engineering and Automati |
| Wang, Luyao | Beihang University |
| Feng, Lin | Beihang University |
Keywords: Micro/Nano Robots, Force Control, Motion Control
Abstract: Untethered microrobots driven by multiple external physical fields hold promise for minimally invasive disease treatment. One common driving field is the gradient magnetic field, which can provide microrobots with adequate driving force in complicated environments. In this study, a method for controlling a microrobot with a gradient magnetic field system is presented, realized by moving the field free point (FFP) to produce a variable magnetic driving force. A confirmatory experiment on reciprocating robot motion control is conducted in a 1D gradient magnetic robot system. The control method could be applied to further studies on in vivo applications such as targeted microrobot drug delivery.
|
| |
| 14:24-14:30, Paper MoBT7.5 | Add to My Program |
| Helical Propulsion in Low-Re Numbers with Near-Zero Angle of Attack |
|
| Ligtenberg, Leendert-Jan Wouter | University of Twente |
| Ekkelkamp, Ilse Alena Antonia | University of Twente |
| Halfwerk, Frank | University of Twente |
| Goulas, Constantinos | University of Twente |
| Arens, Jutta | University of Twente |
| Warle, Michiel | Radboud University Medical Center |
| Khalil, Islam S.M. | University of Twente |
Keywords: Micro/Nano Robots, Medical Robots and Systems
Abstract: One approach to the wireless actuation and gravity compensation of untethered helical magnetic devices (UHMDs) is swimming with a non-zero angle of attack (AoA). This configuration counteracts gravity so that, for a given desired path, the UHMD can be moved controllably without drifting downward under its own weight. This study investigates the use of a reduced-order model of the complex 6-degrees-of-freedom model of UHMDs in the low-Reynolds-number regime. A one-dimensional model representing the relative position of the UHMD with respect to a rotating permanent magnet actuator is used to predict a gap that yields bounded behavior of the open-loop system. Using a geometric representation of the reduced-order model, the locally bounded behavior of the UHMD with near-zero AoA is attributed to periodic active magnetic suspension, which dominates near zero AoA. Our numerical results are verified experimentally, and the bounded behavior of the UHMD demonstrates the capability to swim with near-zero AoA (6.3° ± 2.2°) without drifting downward. With this actuation strategy, the orientation of the UHMD is unlikely to be needed during noninvasive localization, making the control system dependent only on its position with respect to a prescribed trajectory. This strategy will also provide a computational advantage in adjusting the gap between the UHMD and a robotically controlled rotating permanent magnet actuator.
|
| |
| 14:30-14:36, Paper MoBT7.6 | Add to My Program |
| Influence of Nanoparticle Coating on the Differential Magnetometry and Wireless Actuation of Biohybrid Microrobots |
|
| Magdanz, Veronika | University of Waterloo |
| Cumming, Jack | University of Twente |
| Salamzadeh, Sadaf | University of Twente |
| Tesselaar, Sven | University of Twente |
| Lejla, Alic | University of Twente |
| Abelmann, Leon | University of Twente |
| Khalil, Islam S.M. | University of Twente |
Keywords: Micro/Nano Robots
Abstract: Magnetic nanoparticles can be electrostatically assembled around sperm cells to form biohybrid microrobots. These biohybrid microrobots possess sufficient magnetic material to potentially allow for pulse-echo localization and wireless actuation. Alternatively, magnetic excitation of these nanoparticles can be used for localization based on Faraday's law of induction using a detection coil. Here, we investigate the influence of the electrostatic attraction between positively charged nanoparticles and negatively charged sperm cells on the activation of the nanoparticles during nonlinear differential magnetometry and wireless magnetic actuation. Activation of clusters of free nanoparticles and of nanoparticles bound to the body of sperm cells is achieved by a combination of a high-frequency alternating field and a pulsating static field. The nonlinear response in both cases indicates that constraining the nanoparticles is likely to significantly decrease the magnetometry sensitivity. While the attachment of particles to the cells enables wireless actuation (rolling locomotion), the rate of change of the magnetization of the nanoparticles decreases by one order of magnitude compared to free nanoparticles.
|
| |
| 14:36-14:42, Paper MoBT7.7 | Add to My Program |
| Using Piezoceramic-Actuated Stages in Precision Long-Stroke Motion Systems: A Design Procedure |
|
| Al-Rawashdeh, Yazan | Memorial University of Newfoundland |
| Al Saaideh, Mohammad | Memorial University of Newfoundland |
| Al Janaideh, Mohammad | University of Guelph |
Keywords: Automation at Micro-Nano Scales
Abstract: We consider the integration of fine-positioning piezo-actuated stages into precision motion systems, which results in multi-stage configurations. In such configurations, the fine stages are typically attached by mechanical means to the coarse positioning stages, which on their own do not meet the required precision. Once the motion is synchronized, the fine stages enhance the overall precision of the multi-stage system. Undesirably, mechanical and electromagnetic interference between the stages takes place, which may limit the attainable precision. To control the fine stages, we propose feedforward control based on the inverse of the Prandtl-Ishlinskii model in an attempt to accommodate the dynamic behavior and hysteresis of the piezoceramics. Targeting semiconductor manufacturing, the multi-stage design steps of the proposed approach are outlined. We also assess the performance of a representative precision motion system, comprising a planar coarse stage and a uni-axial fine stage, under step-and-scan trajectories. The results show that the proposed piezo-actuated fine stage improves the scanning accuracy of the overall motion system.
|
| |
| 14:42-14:48, Paper MoBT7.8 | Add to My Program |
| Buoyancy Enabled Non-Inertial Dynamic Walking |
|
| Yim, Mark | University of Pennsylvania |
| Gosrich, Walker | University of Pennsylvania |
| Miskin, Marc | University of Pennsylvania |
Keywords: Micro/Nano Robots, Legged Robots
Abstract: We propose a mechanism for low-Reynolds-number walking (e.g., legged microscale robots). Whereas locomotion for legged robots has traditionally been classified as dynamic (where inertia plays a role) or static (where the system is always statically stable), we introduce a new locomotion modality we call buoyancy enabled non-inertial dynamic walking, in which inertia plays no role yet the robot is not statically stable. Instead, falling and viscous drag play critical roles. The model assumes squeeze-flow forces from fluid interactions, combined with a well-timed gait, as the mechanism by which forward motion can be achieved by a reciprocating legged robot. Using two physical demonstrations of robots with Reynolds numbers ranging from 0.0001 to 0.02 (a microscale robot in water and a centimeter-scale robot in glycerol), we find that the model qualitatively describes the motion. This model can help in understanding microscale locomotion and in designing new microscale walking robots, including controlling forward and backward motion and potentially steering these robots.
|
| |
| 14:48-14:54, Paper MoBT7.9 | Add to My Program |
| Ultrafast Acoustic Holography with Physics-Reinforced Self-Supervised Learning for Precise Robotic Manipulation |
|
| Lu, Qingyi | Shanghaitech University |
| Zhong, Chengxi | ShanghaiTech University |
| Liu, Qing | Shanghaitech University |
| Li, Teng | Tsinghua University |
| Su, Hu | Institute of Automation, Chinese Academy of Science |
| Liu, Song | ShanghaiTech University |
Keywords: Micro/Nano Robots, Dexterous Manipulation, Deep Learning Methods
Abstract: Ultrafast acoustic holography (AH), which enables dynamic contactless micro/nano robotic manipulation, has recently attracted wide attention. AH encodes a specific three-dimensional (3D) acoustic field on a two-dimensional (2D) hologram, thereby realizing holographic reconstruction with high fidelity. However, current approaches are limited in encoding time, accuracy, and flexibility, making them unsuitable for dynamic and precise robotic manipulation. Here, we develop an approach to overcome these issues. Its basic idea is to train a convolutional neural network in a self-supervised manner through iterative interaction with a virtual physical environment. Energy conservation is incorporated to enforce the physical constraint during wave propagation. The experimental results demonstrate that the proposed method circumvents laborious annotated-dataset preparation and benefits from the reinforcement of the physics model. Validation and comparison on distinct acoustic fields with various patterns confirm the accuracy and real-time performance of the proposed method, supporting dynamic and precise robotic manipulation.
|
| |
| 14:54-15:00, Paper MoBT7.10 | Add to My Program |
| Surface Navigation of Alginate Artificial Cells in Mucus Solutions |
|
| Rogowski, Louis | Applied Research Associates |
| Wood, Justin | Applied Research Associates |
| Cooke, Tobias | Applied Research Associates |
| Kararsiz, Gokhan | Southern Methodist University |
| Kim, MinJun | Southern Methodist University |
Keywords: Micro/Nano Robots, Medical Robots and Systems, Soft Robot Applications
Abstract: Alginate hydrogels are widely researched in pharmaceutical applications for their ability to encapsulate and disperse therapeutics in response to stimuli. While effective, their utility can be greatly improved once converted into artificial-cell soft microrobots, allowing them to actively navigate through complex in vivo environments and facilitate targeted drug delivery. In this study, artificial cells were fabricated by crosslinking alginate with magnetic nanoparticles and then deployed within mucus solutions to characterize their propulsion capabilities. The goal of this study was to understand how variations in simplified gastrointestinal fluid, artificial cell properties, and magnetic field characteristics affect surface locomotion. A comparison between automatic feedback control and manual "open-loop" operation was also quantitatively explored. Under feedback control, individual artificial cells were navigated with automatically generated waypoints and a PID controller. Simulations were used to verify controller performance and accuracy. User operation was carried out using an Xbox controller, where the joystick could directly change the navigation direction. We conclude that the surface navigation of artificial cells is highly predictable within mucus concentrations and that feedback and open-loop control are equally successful for navigation.
|
| |
| 15:00-15:06, Paper MoBT7.11 | Add to My Program |
| Design and Control of Microscale Dual Locomotion Mode Multi-Functional Robots (μDMMFs) |
|
| Davis, Aaron C. | Purdue University |
| Cappelleri, David | Purdue University |
Keywords: Micro/Nano Robots
Abstract: This paper presents the design and control of a novel microrobot that utilizes two distinct magnetic locomotion methods, a combination of rotating and gradient field control, for precise micro-object manipulation using multiple end-effectors. Rotating magnetic fields induce a tumbling locomotion mode to increase the movement speed and decrease issues associated with stiction and locomotion over rough surfaces. The gradient field control allows for precise manipulation using the end-effectors, which include a pointed tip for splitting groups of objects and a blunt end for pushing or capturing objects. The microrobot is fabricated using a two-photon polymerization 3D printer, allowing for the precise reproduction of complex geometries and designs. The potential applications of this technology in the medical field are discussed, highlighting the potential for in vitro cellular manipulation.
|
| |
| 15:06-15:12, Paper MoBT7.12 | Add to My Program |
| A New 1-Mg Fast Unimorph SMA-Based Actuator for Microrobotics |
|
| Trygstad, Conor | Washington State University |
| Nguyen, Xuan-Truc | University of Southern California |
| Perez-Arancibia, Nestor O | Washington State University (WSU) |
Keywords: Micro/Nano Robots, Biologically-Inspired Robots, Methods and Tools for Robot System Design
Abstract: We present a new unimorph actuator for microrobotics, which is driven by thin shape-memory alloy (SMA) wires. Using a passive-capillary-alignment technique and existing SMA-microsystem fabrication methods, we developed an actuator that is 7 mm long, has a volume of 0.45 mm³, weighs 0.96 mg, and can achieve operation frequencies of up to 40 Hz as well as lift 155 times its own weight. To demonstrate the capabilities of the proposed actuator, we created an 8-mg crawler, the MiniBug, and a bioinspired 56-mg controllable water-surface-tension crawler, the WaterStrider. The MiniBug is 8.5 mm long, can locomote at speeds as high as 0.76 BL/s (body lengths per second), and is the lightest fully-functional crawling microrobot of its type ever created. The WaterStrider is 22 mm long, and can locomote at speeds of up to 0.28 BL/s as well as execute turning maneuvers at angular rates on the order of 0.144 rad/s. The WaterStrider is the lightest controllable SMA-driven water-surface-tension crawler developed to date.
|
| |
| 15:12-15:18, Paper MoBT7.13 | Add to My Program |
| Toward Sub-Gram Helicopters: Designing a Miniaturized Flybar for Passive Stability |
|
| Johnson, Kyle | University of Washington Paul G. Allen School for Computer Scien |
| Arroyos, Vicente | University of Washington |
| Villanueva, Raul | University of Washington |
| Schulz, Adriana | MIT |
| Fuller, Sawyer | University of Washington |
| Iyer, Vikram | University of Washington |
Keywords: Micro/Nano Robots, Mechanism Design, Aerial Systems: Mechanics and Control
Abstract: Sub-gram flying robots have transformative potential in applications from search and rescue to precision agriculture to environmental monitoring. However, a key gap in achieving autonomous flight for these applications is the low lift-to-weight ratio of flapping-wing and quadrotor designs at around 1 g or less. To close this gap, we propose a helicopter-style design that minimizes size and weight by leveraging the high lift, reliability, and low voltage of sub-gram motors. We take an important step toward this goal by designing a lightweight, microfabricated flybar mechanism to passively stabilize such a robot. Our 48 mg flybar is folded from a flat carbon fiber laminate into a 3D mechanism that couples tilting of the flybar to a change in the angle of attack of the rotors. Our design uses flexure joints instead of the ball-in-socket joints common in larger flybars. To expedite the design exploration and optimization of a microfabricated flat-folded flybar, we develop a novel user-in-the-loop bi-level optimization workflow that combines Bayesian optimization design tools and expert feedback. We develop four template designs and use this method to achieve a peak damping ratio of 0.528, an 18.9x improvement over our initial design. Compared to a flybar-less rotor with a near-zero damping ratio, our flybar-rotor mechanism maintains a stable roll and pitch with relative deviations <1°. Our results show that, if combined with a counter-torque mechanism such as a tail rotor, our miniaturized flybar could mechanically provide attitude stability for a sub-gram helicopter.
|
| |
| 15:18-15:24, Paper MoBT7.14 | Add to My Program |
| Manipulation of Optical Force-Induced Micro-Assemblies at the Air-Liquid Interface |
|
| Carlisle, Nicholas | Massey University |
| Williams, Martin | Massey University |
| Whitby, Catherine | Massey University |
| Nock, Volker | University of Canterbury |
| Chen, Jack L Y | AUT |
| Avci, Ebubekir | Massey University |
Keywords: Micro/Nano Robots, Automation at Micro-Nano Scales, Swarm Robotics
Abstract: Colloidal particles trapped by a focused laser at the air-liquid interface exhibit an interesting assembly dynamic. In this study, we demonstrate the manipulation of optical force-induced swarms via the dynamic locomotion of assemblies built with holographic optical tweezers. This manipulation approach lays the foundation for autonomous control of building assemblies at the air-liquid interface, the first time optical microrobots have performed this feat. Our proposed semi-autonomous control allows users to produce small dynamic secondary assemblies at the interface, which are transported to and merged with a main static assembly. This static-dynamic approach grows assemblies up to ∼2.1 times larger than conventional methods. Real-time manipulation and control of large-scale optical force-induced assemblies to create reconfigurable swarms has the potential to lead to new technologies and approaches for complex tasks, such as developing new materials, transporting biological matter, studying biofilm formation by bacterial colonies at the air-liquid interface, and more.
|
| |
| MoBT8 Regular session, 141 |
Add to My Program |
| Legged Robots II |
|
| |
| Chair: Lee, Dongjun | Seoul National University |
| Co-Chair: Clark, Jonathan | Florida State University |
| |
| 14:00-14:06, Paper MoBT8.1 | Add to My Program |
| Creating a Dynamic Quadrupedal Robotic Goalkeeper with Reinforcement Learning |
|
| Huang, Xiaoyu | Georgia Institute of Technology |
| Li, Zhongyu | University of California, Berkeley |
| Xiang, Yanzhen | ETH Zurich |
| Ni, Yiming | University of California Berkeley |
| Chi, Yufeng | University of California, Berkeley |
| Li, Yunhao | University of California, Berkeley |
| Yang, Lizhi | California Institute of Technology |
| Peng, Xue Bin | Simon Fraser University |
| Sreenath, Koushil | University of California, Berkeley |
Keywords: Legged Robots, Reinforcement Learning, Whole-Body Motion Planning and Control
Abstract: We present a reinforcement learning (RL) framework that enables quadrupedal robots to perform soccer goalkeeping tasks in the real world. Soccer goalkeeping with quadrupeds is a challenging problem that combines highly dynamic locomotion with precise and fast non-prehensile object (ball) manipulation. The robot needs to react to and intercept a potentially flying ball using dynamic locomotion maneuvers in a very short amount of time, usually less than one second. In this paper, we propose to address this problem using a hierarchical model-free RL framework. The first component of the framework contains multiple control policies for distinct locomotion skills, which can be used to cover different regions of the goal. Each control policy enables the robot to track random parametric end-effector trajectories while performing one specific locomotion skill, such as a jump, dive, or sidestep. These skills are then utilized by the second part of the framework, a high-level planner that determines a desired skill and end-effector trajectory in order to intercept a ball flying toward different regions of the goal. We deploy the proposed framework on a Mini Cheetah quadrupedal robot and demonstrate its effectiveness for various agile interceptions of a fast-moving ball in the real world.
|
| |
| 14:06-14:12, Paper MoBT8.2 | Add to My Program |
| Walking in Narrow Spaces: Safety-Critical Locomotion Control for Quadrupedal Robots with Duality-Based Optimization |
|
| Liao, Qiayuan | University of California, Berkeley |
| Li, Zhongyu | University of California, Berkeley |
| Thirugnanam, Akshay | University of California, Berkeley |
| Zeng, Jun | University of California, Berkeley |
| Sreenath, Koushil | University of California, Berkeley |
Keywords: Legged Robots, Collision Avoidance, Optimization and Optimal Control
Abstract: This paper presents a safety-critical locomotion control framework for quadrupedal robots. Our goal is to enable quadrupedal robots to safely navigate in cluttered environments. To tackle this, we introduce exponential Discrete Control Barrier Functions (exponential DCBFs) with duality-based obstacle avoidance constraints into a Nonlinear Model Predictive Control (NMPC) with Whole-Body Control (WBC) framework for quadrupedal locomotion control. This enables us to use polytopes to describe the shapes of the robot and obstacles for collision avoidance while doing locomotion control of quadrupedal robots. Compared to most prior work, especially using CBFs, that utilizes spherical and conservative approximations for obstacle avoidance, this work demonstrates a quadrupedal robot autonomously and safely navigating through very tight spaces in the real world.
|
| |
| 14:12-14:18, Paper MoBT8.3 | Add to My Program |
| ARMP: Autoregressive Motion Planning for Quadruped Locomotion and Navigation in Complex Indoor Environments |
|
| Kim, Jeonghwan | Georgia Institute of Technology |
| Li, Tianyu | Facebook |
| Ha, Sehoon | Georgia Institute of Technology |
Keywords: Legged Robots, Task and Motion Planning, Simulation and Animation
Abstract: Generating natural and physically feasible motions for legged robots has been a challenging problem due to their complex dynamics. In this work, we introduce a novel learning-based framework, the autoregressive motion planner (ARMP), for quadruped locomotion and navigation. Our method can generate motion plans of arbitrary length in an autoregressive fashion, unlike most offline trajectory optimization algorithms, which assume a fixed trajectory length. To this end, we first construct a motion library by solving a dense set of trajectory optimization problems for diverse scenarios and parameter settings. We then learn the motion manifold from the dataset in a supervised-learning fashion. We show that the proposed ARMP can generate physically plausible motions for various tasks and situations. We also showcase that our method can be successfully integrated with recent robot navigation frameworks as a low-level controller and unleash the full capability of legged robots for complex indoor navigation.
|
| |
| 14:18-14:24, Paper MoBT8.4 | Add to My Program |
| Perceptive Hexapod Legged Locomotion for Climbing Joist Environments |
|
| Zang, Zixian | University of California, Berkeley |
| Kawawa-Beaudan, Maxime | J.P. Morgan AI Research |
| Yu, Wenhao | Google |
| Zhang, Tingnan | Google |
| Zakhor, Avideh | University of California, Berkeley |
Keywords: Legged Robots, Reinforcement Learning
Abstract: Attics are one of the largest sources of energy loss in residential homes, but they are uncomfortable and dangerous places for human workers to conduct air sealing and insulation. Hexapod robots are potentially suitable for carrying out those tasks in tight attic spaces since they are stable, compact, and lightweight. For hexapods to succeed in these tasks, they must be able to navigate inside the tight attic spaces of single-family residential homes in the U.S., which typically contain rows of approximately 6- or 8-inch-tall joists placed 16 inches apart from each other. Climbing over such obstacles is challenging for autonomous robotic systems. In this work, we develop a perceptive walking model for legged hexapods that can traverse terrain with random joist structures using egocentric vision. Our method can be used on low-cost hardware that does not provide real-time joint-state feedback. We train our model in a teacher-student fashion with 2 phases: in phase 1, we use reinforcement learning with access to privileged information such as local elevation maps and joint feedback; in phase 2, we use supervised learning to distill the model into one with access to only onboard observations, consisting of egocentric depth images and robot orientation captured by a tracking camera. We demonstrate zero-shot sim-to-real transfer on a SpiderPi robot, equipped with an onboard depth camera, climbing over joist courses we construct to simulate the environment in the field. Our proposed method achieves a nearly 100% success rate climbing over the test courses, significantly outperforming both the model without perception and the controller provided by the manufacturer.
|
| |
| 14:24-14:30, Paper MoBT8.5 | Add to My Program |
| Design of STARQ: A Multimodal Quadrupedal Robot for Running, Climbing, and Swimming |
|
| Vasquez, Derek A. | Florida State University |
| Jay, David | FAMU-FSU College of Engineering |
| Dina, Michael | Florida State University |
| Austin, Max | Florida State University |
| McConomy, Shayne | FAMU - FSU College of Engineering |
| Clark, Jonathan | Florida State University |
Keywords: Legged Robots, Climbing Robots, Biologically-Inspired Robots
Abstract: Legged animals have developed a variety of modes of locomotion to adapt to the diverse and unknown terrain challenges posed in the natural world. Legged robots, however, have been largely limited to specializing in one domain, with few that have endeavored to bridge the gap between two. In this work we present the Scansorial, Terrestrial, and Aquatic Robot Quadruped (STARQ), a novel legged robot capable of bridging three different domains with three modes of locomotion: walking, climbing, and swimming. In this study we describe model-based design techniques as well as design innovations that have made multimodal locomotion possible including waterproof hips for 2-DOF high torque legs, legs capable of effective power transmission in three modes, and bi-directionally compliant feet for walking and attaching to vertical surfaces. To demonstrate the robot's capabilities we present locomotion test data including speed and cost of transport in each of these domains. We also demonstrate the capability to transition from walking to swimming in a natural environment.
|
| |
| 14:30-14:36, Paper MoBT8.6 | Add to My Program |
| Hierarchical Adaptive Control for Collaborative Manipulation of a Rigid Object by Quadrupedal Robots |
|
| Sombolestan, Mohsen | University of Southern California |
| Nguyen, Quan | University of Southern California |
Keywords: Legged Robots, Robust/Adaptive Control, Mobile Manipulation
Abstract: Despite the potential benefits of collaborative robots, effective manipulation tasks with quadruped robots remain difficult to realize. In this paper, we propose a hierarchical control system that can handle real-world collaborative manipulation tasks, including uncertainties arising from object properties, shape, and terrain. Our approach consists of three levels of controllers. Firstly, an adaptive controller computes the required force and moment for object manipulation without prior knowledge of the object's properties and terrain. The computed force and moment are then optimally distributed between the team of quadruped robots using a Quadratic Programming (QP)-based controller. This QP-based controller optimizes each robot's contact point location with the object while satisfying constraints associated with robot-object contact. Finally, a decentralized loco-manipulation controller is designed for each robot to apply manipulation force while maintaining the robot's stability. We successfully validated our approach in a high-fidelity simulation environment where a team of quadruped robots manipulated an unknown object weighing up to 18 kg on different terrains while following the desired trajectory.
|
| |
| 14:36-14:42, Paper MoBT8.7 | Add to My Program |
| Proprioception and Reaction for Walking among Entanglements |
|
| Yim, Justin K. | University of Illinois Urbana-Champaign |
| Ren, Jiming | Carnegie Mellon University |
| Ologan, David | Carnegie Mellon University |
| Garcia Gonzalez, Selvin Orlando | Carnegie Mellon University |
| Johnson, Aaron M. | Carnegie Mellon University |
Keywords: Legged Robots, Force and Tactile Sensing
Abstract: Entanglements like vines and branches in natural settings or cords and pipes in human spaces prevent mobile robots from accessing many environments. Legged robots should be effective in these settings, more so than wheeled or tracked platforms, but naive controllers quickly become entangled and stuck. In this paper we present a proprioception method aimed specifically at sensing entanglements of a robot's legs, as well as a reaction strategy to disentangle legs during their swing phase as they advance to their next foothold. We demonstrate that our proprioception and reaction strategy enables traversal of entanglements of many stiffnesses and geometries, succeeding in 14 out of 16 trials in laboratory tests as well as in a natural outdoor environment.
|
| |
| 14:42-14:48, Paper MoBT8.8 | Add to My Program |
| Learning a Single Policy for Diverse Behaviors on a Quadrupedal Robot Using Scalable Motion Imitation |
|
| Klipfel, Arnaud | Georgia Tech |
| Sontakke, Nitish Rajnish | Georgia Institute of Technology |
| Liu, Ren | Georgia Institute of Technology |
| Ha, Sehoon | Georgia Institute of Technology |
Keywords: Legged Robots, Reinforcement Learning, Imitation Learning
Abstract: Learning various motor skills for quadrupedal robots is a challenging problem that requires careful design of task-specific mathematical models or reward descriptions. In this work, we propose to learn a single capable policy using deep reinforcement learning by imitating a large number of reference motions, including walking, turning, pacing, jumping, sitting, and lying. On top of the existing motion imitation framework, we first carefully design the observation space, the action space, and the reward function to improve the scalability of the learning as well as the robustness of the final policy. In addition, we adopt a novel adaptive motion sampling (AMS) method, which maintains a balance between successful and unsuccessful behaviors. This technique allows the learning algorithm to focus on challenging motor skills and avoid catastrophic forgetting. We demonstrate that the learned policy can exhibit diverse behaviors in simulation by successfully tracking both the training dataset and out-of-distribution trajectories. We also validate the importance of the proposed learning formulation and the adaptive motion sampling scheme by conducting experiments.
|
| |
| 14:48-14:54, Paper MoBT8.9 | Add to My Program |
| A Novel Lockable Spring-Loaded Prismatic Spine to Support Agile Quadrupedal Locomotion |
|
| Ye, Keran | University of California, Riverside |
| Chung, Kenneth | University of California, Riverside |
| Karydis, Konstantinos | University of California, Riverside |
Keywords: Legged Robots, Mechanism Design, Compliant Joints and Mechanisms
Abstract: This paper introduces a way to systematically investigate the effect of compliant prismatic spines on quadrupedal robot locomotion. We develop a novel spring-loaded lockable spine module, together with a new Spinal Compliance-Integrated Quadruped (SCIQ) platform, for both empirical and numerical research. Individual spine tests reveal beneficial spinal characteristics, like a degressive spring, and validate the efficacy of the proposed compact locking/unlocking mechanism for the spine. Benchmark vertical jumping and landing tests with our robot show comparable jumping performance between the rigid and compliant spines. An observed advantage of the compliant spine module is that it can alleviate more challenging landing conditions by absorbing impact energy and dissipating the remainder via foot slipping, much in a cat-like stretching fashion.
|
| |
| 14:54-15:00, Paper MoBT8.10 | Add to My Program |
| Tunable Impact and Vibration Absorbing Neck for Robust Visual-Inertial State Estimation for Dynamic Legged Robots |
|
| Kim, Taekyun | Seoul National University |
| Kim, Sangbae | Massachusetts Institute of Technology |
| Lee, Dongjun | Seoul National University |
Keywords: Legged Robots, Mechanism Design, Visual-Inertial SLAM
Abstract: We propose a new neck design for legged robots to achieve robust visual-inertial state estimation during dynamic locomotion. While visual-inertial state estimation is widely used in robotics, it is easily disturbed by the impacts and vibration generated when legged robots move dynamically. Rubber dampers may be a solution, but even if the dampers are suitable for some gaits, they may deform excessively or resonate at certain frequencies during other gaits, since they are not tunable. To address this problem, we develop a tunable neck system that absorbs impacts and vibration across diverse gaits. The system consists of two components: 1) a suspension mechanism that compensates for the weight of the head equipped with a camera and IMU (inertial measurement unit) and absorbs impacts and high-frequency head motion, including vibration, acting as a fixed low-pass filter; and 2) a dynamic vibration absorber (DVA) that can be reactively adjusted to diverse gait frequencies to alleviate excessive head movements. We present a dynamics analysis of the neck system and show how to adjust its target frequency. Simulation and experimental validation verify the effect of the proposed neck design, demonstrating superior estimation performance and robustness across diverse gaits.
|
| |
| 15:00-15:06, Paper MoBT8.11 | Add to My Program |
| Embodying Quasi-Passive Modal Trotting and Pronking in a Sagittal Quadruped |
|
| Calzolari, Davide | German Aerospace Center, Technical University of Munich |
| Della Santina, Cosimo | TU Delft |
| Giordano, Alessandro Massimo | DLR (German Aerospace Center) |
| Schmidt, Annika | Technical University of Munich (TUM) |
| Albu-Schäffer, Alin | DLR - German Aerospace Center |
Keywords: Legged Robots, Natural Machine Motion, Passive Walking
Abstract: Animals rely on the elasticity of their tendons and muscles to execute robust and efficient locomotion patterns for a vast and continuous range of velocities. Replicating such capabilities in artificial systems is a long-lasting challenge in robotics. By taking advantage of a pitch dynamics decoupling spring potential, this work aims to provide design rules and a control strategy to generate dynamic, efficient locomotion patterns in quadrupeds moving in a sagittal plane. We rely on nonlinear modal theory, which provides the tools to characterize continuous families of efficient oscillations in nonlinear mechanical systems. We provide simulations of an elastic quadruped showing that the proposed solution can robustly excite efficient locomotion patterns under non-ideal conditions.
|
| |
| 15:06-15:12, Paper MoBT8.12 | Add to My Program |
| Design, Modeling and Control of a Quadruped Robot SPIDAR: Spherically Vectorable and Distributed Rotors Assisted Air-Ground Quadruped Robot |
|
| Zhao, Moju | The University of Tokyo |
| Anzai, Tomoki | The University of Tokyo |
| Nishio, Takuzumi | The University of Tokyo |
Keywords: Legged Robots, Aerial Systems: Mechanics and Control, Motion Control
Abstract: Multimodal locomotion capability is an emerging topic in the robotics field, and various novel mobile robots have been developed to enable maneuvering in both terrestrial and aerial domains. Among these hybrid robots, several state-of-the-art bipedal robots can perform complex walking motions interlaced with flying. These robots are also desired to have manipulation ability; however, it is difficult for the current designs to maintain stability during joint motion in midair due to their centralized rotor arrangement. Therefore, in this work, we develop a novel air-ground quadruped robot called SPIDAR, which is assisted by spherically vectorable rotors distributed in each link to enable both walking motion and transformable flight. First, we present a unique mechanical design for a quadruped robot that enables terrestrial and aerial locomotion. We then present the modeling method for this hybrid robot platform, and further develop an integrated control strategy for both walking and flying with joint motion. Finally, we demonstrate the feasibility of the proposed hybrid quadruped robot by performing a seamless motion that involves static walking and subsequent flight. To the best of our knowledge, this work is the first to achieve a quadruped robot with multimodal locomotion capability.
|
| |
| MoBT9 Regular session, 142ABC |
Add to My Program |
| Motion and Path Planning II |
|
| |
| Chair: Draelos, Mark | University of Michigan |
| Co-Chair: Hollinger, Geoffrey | Oregon State University |
| |
| 14:00-14:06, Paper MoBT9.1 | Add to My Program |
| Time-Optimal Spiral Trajectories with Closed-Form Solutions |
|
| Draelos, Mark | University of Michigan |
Keywords: Motion and Path Planning, Dynamics
Abstract: The Archimedean spiral is a space-filling plane curve found in applications ranging from coverage path planning for robot exploration to scan pattern generation for medical imaging. The constant linear velocity (CLV) parameterization of this spiral is of particular interest due to its fixed path velocity and isotropic sampling capability, but the high accelerations near its origin singularity yield poor trajectory tracking that limits its utility. Here, I derive a closed-form time-optimal time scaling for CLV spirals with large path velocities that mitigates the singularity by inspecting the CLV spiral's acceleration envelope. When applied to a two-degree-of-freedom Cartesian scanner, I demonstrate that this approach reduces trajectory tracking error by up to 97.1% as compared to naive CLV spirals, with low computational overhead. I further show that this time scaling eliminates the central image distortion near the origin for scanning applications that rely on CLV spirals.
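As a rough illustration of the CLV parameterization discussed above, the following sketch generates a constant-linear-velocity Archimedean spiral r = b·theta using the common large-angle approximation theta(t) ≈ sqrt(2vt/b); the parameter names and the approximation are illustrative, not taken from the paper.

```python
import numpy as np

def clv_spiral(v=1.0, b=0.1, t_end=10.0, dt=1e-3):
    """Approximate constant-linear-velocity Archimedean spiral r = b*theta.

    Uses theta(t) ~= sqrt(2*v*t/b), which holds far from the origin;
    near the origin this parameterization demands the large accelerations
    that the paper's time scaling mitigates.
    """
    t = np.arange(dt, t_end, dt)
    theta = np.sqrt(2.0 * v * t / b)   # approximate CLV parameterization
    r = b * theta
    return t, r * np.cos(theta), r * np.sin(theta)

t, x, y = clv_spiral()
# Numerical path speed; close to v away from the origin singularity.
speed = np.hypot(np.diff(x), np.diff(y)) / 1e-3
```

Away from the origin the sampled path speed stays near v, while the first samples near the singularity deviate strongly, which is the tracking problem the abstract addresses.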
|
| |
| 14:06-14:12, Paper MoBT9.2 | Add to My Program |
| Optimal Path Planning through a Sequence of Waypoints |
|
| Goutham, Mithun | Ohio State University |
| Boyle, Stephen | Ohio State University |
| Menon, Meghna | Ford Motor Company |
| Mohan, Shankar | Ford |
| Garrow, Sarah | Ford Motor Company |
| Stockar, Stephanie | Ohio State University |
Keywords: Motion and Path Planning, Intelligent and Flexible Manufacturing, Industrial Robots
Abstract: This paper presents a deterministic approach for finding the optimal path through a sequence of spatial waypoints while accounting for vertex or turn costs. A case study is presented where the proposed algorithm is used to determine the optimal path through a sequence of waypoints. This is then compared with the path obtained when considering only two consecutive waypoints at a time. Further, an approximation that uses three waypoints at a time in a staggered manner is described. This approach is shown to be computationally efficient and finds the optimal path in a case study with 2000 waypoints.
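As a hedged sketch of the sequential structure of this problem (not the paper's algorithm), choosing one configuration per waypoint, e.g., a discretized approach heading, so as to minimize accumulated edge and turn costs can be solved by dynamic programming over the waypoint sequence; all names here are illustrative.

```python
def dp_through_waypoints(options, edge_cost):
    """Minimum total cost through a sequence of waypoints.

    options[i]      -- candidate configurations at waypoint i
    edge_cost(a, b) -- cost of moving from configuration a to b
                       (can fold in turn/vertex costs)
    """
    prev = {o: 0.0 for o in options[0]}          # costs at first waypoint
    for layer in options[1:]:
        # Best cost to reach each configuration in the next layer.
        prev = {o: min(c + edge_cost(p, o) for p, c in prev.items())
                for o in layer}
    return min(prev.values())
```

Considering only pairs or triples of consecutive waypoints, as in the comparison the abstract describes, corresponds to truncating this recursion.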
|
| |
| 14:12-14:18, Paper MoBT9.3 | Add to My Program |
| Efficient Path Planning in Manipulation Planning Problems by Actively Reusing Validation Effort |
|
| Hartmann, Valentin Noah | University of Stuttgart |
| Ortiz-Haro, Joaquim | University of Stuttgart |
| Toussaint, Marc | TU Berlin |
Keywords: Motion and Path Planning, Manipulation Planning
Abstract: The path planning problems arising in manipulation planning and in task and motion planning settings are typically repetitive: the same manipulator moves in a space that only changes slightly. Despite this potential for reuse of information, few planners fully exploit the available information. To better enable this reuse, we decompose the collision checking into reusable and non-reusable parts. We then treat the sequences of path planning problems in manipulation planning as a multiquery path planning problem. This allows the use of planners that actively minimize planning effort over multiple queries and, by doing so, actively reuse previous knowledge. We implement this approach in EIRM* and effort-ordered LazyPRM*, and benchmark it on multiple simulated robotic examples. Further, we show that decomposing the collision checks additionally enables the reuse of the gained knowledge over multiple different instances of the same problem, i.e., in a multiquery manipulation planning scenario. The planners using the decomposed collision checking outperform the other planners in initial solution time by up to a factor of two while providing similar solution quality.
|
| |
| 14:18-14:24, Paper MoBT9.4 | Add to My Program |
| Improving Reliable Navigation under Uncertainty Via Predictions Informed by Non-Local Information |
|
| Arnob, Raihan Islam | George Mason University |
| Stein, Gregory | George Mason University |
Keywords: Motion and Path Planning, Autonomous Agents, AI-Enabled Robotics
Abstract: We improve reliable, long-horizon, goal-directed navigation in partially-mapped environments by using non-local information to predict the goodness of temporally-extended actions that enter unseen space. Making predictions about where to navigate in general requires non-local information: any observations the robot has seen so far may provide information about the goodness of a particular direction of travel. Building on recent work in learning-augmented model-based planning under uncertainty, we present an approach that can both rely on non-local information to make predictions (via a graph neural network) and is reliable by design: it will always reach its goal, even when learning does not provide accurate predictions. We conduct experiments in three simulated environments in which non-local information is needed to perform well. In our large-scale university building environment, generated to scale from real-world floorplans, we demonstrate a 9.3% reduction in cost-to-go compared to a non-learned baseline and a 14.9% reduction compared to a learning-informed planner that can only use local information to inform its predictions.
|
| |
| 14:24-14:30, Paper MoBT9.5 | Add to My Program |
| TOP-UAV: Open-Source Time-Optimal Trajectory Planner for Point-Masses under Acceleration and Velocity Constraints |
|
| Meyer, Fabian | FZI Forschungszentrum Informatik |
| Glock, Katharina | FZI Forschungszentrum Informatik |
| Sayah, David | FZI Forschungszentrum Informatik |
Keywords: Motion and Path Planning, Optimization and Optimal Control, Kinematics
Abstract: In recent research on unmanned aerial vehicles (UAVs), time-optimal trajectory planning of a point mass with acceleration as control input and constrained maximum velocity (TOT-PMAV) has proved very promising for UAV behavior planning. Such trajectories can be calculated within microseconds and tracked with high precision by modern trajectory tracking controllers such as model predictive control (MPC). However, recent research shows that the state-of-the-art (SOTA) approach for generating these time-optimal trajectories is based on an invalid method for synchronizing the coordinate axes, which sometimes yields trajectories that miss the desired final state by a wide margin. Hence, an alternative approach was proposed that claims to resolve this issue, but it still lacks a mathematical proof of correctness. In this work, we provide the missing proof and mathematically demonstrate the problems arising from the SOTA approach. Further, since neither the SOTA nor the alternative approach utilizes the full kinematic capacity of a UAV, we propose an improved solution to TOT-PMAV that better exploits kinematic properties and yields, on average, up to 14% faster trajectories. We substantiate our findings with an extensive computational study, show in which situations the SOTA is likely to fail, and provide metrics to measure the consequences of failure. To enable reproducibility, our code is open-source.
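To make the constraint setting concrete: for a single axis of a point mass with bounded acceleration a_max and bounded velocity v_max, the rest-to-rest minimum time follows a standard triangular or trapezoidal velocity profile. This sketch covers only the single-axis case; synchronizing the axes to a common time is exactly the step the paper shows must be done carefully, and is not sketched here.

```python
import math

def min_time_1d(dx, v_max, a_max):
    """Rest-to-rest minimum time for one axis: x'' = u, |u| <= a_max, |x'| <= v_max."""
    dx = abs(dx)
    if dx <= v_max**2 / a_max:
        # Triangular profile: the velocity limit is never reached.
        return 2.0 * math.sqrt(dx / a_max)
    # Trapezoidal profile: accelerate, cruise at v_max, decelerate.
    return dx / v_max + v_max / a_max
```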
|
| |
| 14:30-14:36, Paper MoBT9.6 | Add to My Program |
| Fast Asymptotically Optimal Path Planning in Dynamic, Uncertain Environments |
|
| Huang, Lu | City University of Hong Kong |
| Jing, Xingjian | City University of Hong Kong |
Keywords: Motion and Path Planning
Abstract: This paper presents Fast Adaptive Tree (FAT), an asymptotically-optimal sampling-based path planner for dynamic and uncertain scenarios; that is, the extracted solution converges to the optimal solution given the sensor information as the number of samples approaches infinity. The planner maintains an underlying graph, which increasingly approximates the search domain, and a dynamic spanning tree of the graph, which contains the shortest path from the start to the goal state. The planner quickly responds to new information about the environment or the robot's movements by minimally repairing the spanning tree during navigation. Simulation results show that the proposed planner achieves higher replanning efficiency than several state-of-the-art path planners without sacrificing solution quality.
|
| |
| 14:36-14:42, Paper MoBT9.7 | Add to My Program |
| An Efficient Trajectory Planner for Car-Like Robots on Uneven Terrain |
|
| Xu, Long | Zhejiang University |
| Chai, Kaixin | Xi'an Jiaotong University |
| Han, Zhichao | Zhejiang University |
| Liu, Hong | Hangzhou City University |
| Xu, Chao | Zhejiang University |
| Cao, Yanjun | Zhejiang University, Huzhou Institute of Zhejiang University |
| Gao, Fei | Zhejiang University |
Keywords: Motion and Path Planning, Nonholonomic Motion Planning, Autonomous Vehicle Navigation
Abstract: Autonomous navigation of ground robots on uneven terrain is being considered in more and more tasks. However, uneven terrain raises two problems for motion planning: how to assess the traversability of the terrain, and how to cope with the robot's terrain-dependent dynamics model. The trajectories generated by existing methods are often too conservative or cannot be tracked well by the controller because the second problem is not well solved. In this paper, we propose terrain pose mapping to describe the impact of terrain on the robot. With this mapping, we can obtain the SE(3) state of the robot on uneven terrain for a given state in SE(2). Building on this mapping, we present a trajectory optimization framework for car-like robots on uneven terrain that addresses both problems. The trajectories generated by our method conform to the dynamics model of the system without being overly conservative, and can still be tracked well by the controller. We perform simulations and real-world experiments to validate the efficiency and trajectory quality of our algorithm.
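An illustrative (not the paper's exact) way to lift an SE(2) state to SE(3) on terrain: place the robot on a height field and align its body z-axis with the local surface normal, projecting the planar heading onto the tangent plane. All names here are assumptions for the sketch.

```python
import numpy as np

def se2_to_se3(x, y, yaw, height, eps=1e-3):
    """Lift (x, y, yaw) in SE(2) to a rotation and translation in SE(3)
    given a terrain height field height(x, y)."""
    z = height(x, y)
    # Numerical terrain gradient -> outward surface normal.
    hx = (height(x + eps, y) - height(x - eps, y)) / (2 * eps)
    hy = (height(x, y + eps) - height(x, y - eps)) / (2 * eps)
    n = np.array([-hx, -hy, 1.0])
    n /= np.linalg.norm(n)
    # Planar heading projected onto the terrain tangent plane.
    d = np.array([np.cos(yaw), np.sin(yaw), 0.0])
    xb = d - n * (d @ n)
    xb /= np.linalg.norm(xb)
    yb = np.cross(n, xb)
    R = np.column_stack([xb, yb, n])   # body axes as rotation columns
    return R, np.array([x, y, z])
```

On flat terrain this reduces to the identity rotation, and on sloped terrain it yields a valid (orthonormal) attitude tied to the surface.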
|
| |
| 14:42-14:48, Paper MoBT9.8 | Add to My Program |
| Robots As AI Double Agents: Privacy in Motion Planning |
|
| Shome, Rahul | The Australian National University |
| Kingston, Zachary | Rice University |
| Kavraki, Lydia | Rice University |
Keywords: Motion and Path Planning
Abstract: Robotics and automation are poised to change the landscape of home and work in the near future. Robots are adept at deliberately moving, sensing, and interacting with their environments. The pervasive use of this technology promises societal and economic payoffs due to its capabilities; conversely, the capabilities of robots to move within and sense the world around them are susceptible to abuse. Robots, unlike typical sensors, are inherently autonomous, active, and deliberate. Such automated agents can become AI double agents liable to violate the privacy of coworkers, privileged spaces, and other stakeholders. In this work, we highlight the understudied and inevitable threats to privacy that can be posed by the autonomous, deliberate motions and sensing of robots. We frame the problem within broader sociotechnological questions, alongside a comprehensive review. The privacy-aware motion planning problem is formulated in terms of cost functions that can be modified to induce privacy-aware behavior: preserving, agnostic, or violating. Simulated case studies in manipulation and navigation, with altered cost functions, are used to demonstrate how privacy-violating threats can be easily injected, sometimes with only small changes in performance (solution path lengths). Such functionality is already widely available. This preliminary work is meant to lay the foundations for near-future, holistic, interdisciplinary investigations that can address questions surrounding privacy in intelligent robotic behaviors determined by planning algorithms.
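A minimal sketch in the spirit of the cost-function formulation above: path length plus a penalty for configurations whose sensing footprint observes a privileged region. The weight lam and the observes_private predicate are illustrative assumptions, not from the paper.

```python
import math

def privacy_aware_cost(path, observes_private, lam=10.0):
    """Path cost = geometric length + lam * number of privacy-exposing states.

    lam > 0 induces privacy-preserving behavior; lam = 0 is agnostic;
    lam < 0 would induce privacy-violating behavior.
    """
    length = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    exposure = sum(1.0 for q in path if observes_private(q))
    return length + lam * exposure
```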
|
| |
| 14:48-14:54, Paper MoBT9.9 | Add to My Program |
| Bang-Bang Boosting of RRTs |
|
| LaValle, Alexander J. | University of Oulu |
| Sakcak, Basak | University of Oulu |
| LaValle, Steven M | University of Oulu |
Keywords: Motion and Path Planning
Abstract: This paper presents methods for dramatically improving the performance of sampling-based kinodynamic planners. The key component is a complete, exact steering method that produces a time-optimal trajectory between any states for a vector of synchronized double integrators. This method is applied in three ways: 1) to generate RRT edges that quickly solve the two-point boundary-value problems, 2) to produce a (quasi)metric for more accurate Voronoi bias in RRTs, and 3) to iteratively time-optimize a given collision-free trajectory. Experiments are performed for state spaces with up to 2000 dimensions, resulting in improved computed trajectories and orders of magnitude computation time improvements over using ordinary metrics and constant controls.
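To make the steering primitive concrete: for a single double integrator (x'' = u, |u| <= a_max), the time-optimal rest-to-rest maneuver is a bang-bang with one switch, and a vector of axes is synchronized by the slowest axis. This sketch covers only the rest-to-rest special case; the paper's exact steering handles arbitrary boundary states, which is considerably more involved.

```python
import math

def rest_to_rest_time(dx, a_max):
    """Minimum time to move a double integrator by dx between rest states:
    accelerate at +a_max for half the distance, then -a_max."""
    return 2.0 * math.sqrt(abs(dx) / a_max)

def synchronized_time(deltas, a_max):
    """Common trajectory time for a vector of synchronized double
    integrators: the slowest axis dictates the duration."""
    return max(rest_to_rest_time(d, a_max) for d in deltas)
```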
|
| |
| 14:54-15:00, Paper MoBT9.10 | Add to My Program |
| Geometric Gait Optimization for Inertia-Dominated Systems with Nonzero Net Momentum |
|
| Yang, Yanhao | Oregon State University |
| Hatton, Ross | Oregon State University |
Keywords: Nonholonomic Motion Planning, Motion and Path Planning, Nonholonomic Mechanisms and Systems
Abstract: Inertia-dominated mechanical systems can achieve net displacement by 1) periodically changing their shape (known as kinematic gait) and 2) adjusting their inertia distribution to utilize the existing nonzero net momentum (known as momentum gait). Therefore, finding the gait that most effectively utilizes the two types of locomotion in terms of the magnitude of the net momentum is a significant topic in the study of locomotion. For kinematic locomotion with zero net momentum, the geometry of optimal gaits is expressed as the equilibria of system constraint curvature flux through the surface bounded by the gait, and the cost associated with executing the gait in the metric space. In this paper, we identify the geometry of optimal gaits with nonzero net momentum effects by lifting the gait description to a time-parameterized curve in shape-time space. We also propose the variational gait optimization algorithm corresponding to the lifted geometric structure, and identify two distinct patterns in the optimal motion, determined by whether or not the kinematic and momentum gaits are concentric. The examples of systems with and without fluid-added mass demonstrate that the proposed algorithm can efficiently solve forward and turning locomotion gaits in the presence of nonzero net momentum. At any given momentum and effort limit, the proposed optimal gait that takes into account both momentum and kinematic effects outperforms the reference gaits that each only considers one of these effects.
|
| |
| 15:00-15:06, Paper MoBT9.11 | Add to My Program |
| Real-Time Tube-Based Non-Gaussian Risk Bounded Motion Planning for Stochastic Nonlinear Systems in Uncertain Environments Via Motion Primitives |
|
| Han, Weiqiao | Massachusetts Institute of Technology |
| M. Jasour, Ashkan | MIT |
| Williams, Brian | MIT |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation, Probability and Statistical Methods
Abstract: We consider the motion planning problem for stochastic nonlinear systems in uncertain environments. More precisely, in this problem the robot has stochastic nonlinear dynamics and an uncertain initial location, and the environment contains multiple dynamic uncertain obstacles. Obstacles can be of arbitrary shape, can deform, and can move. The uncertainties do not necessarily have Gaussian distributions. This general setting has been considered and solved in [1]. In addition to the assumptions above, in this paper we consider long-term tasks, where the planning method in [1] would fail because the uncertainty of the system states grows too large over a long time horizon. Unlike [1], we present a real-time online motion planning algorithm. We build discrete-time motion primitives and their corresponding continuous-time tubes offline, so that almost all system states of each motion primitive are guaranteed to stay inside the corresponding tube. We convert probabilistic safety constraints into a set of deterministic constraints called risk contours. During online execution, we verify the safety of the tubes against the deterministic risk contours using sum-of-squares (SOS) programming. The SOS-based method verifies the safety of a tube in the presence of uncertain obstacles in real time, without the need for uncertainty samples or time discretization. By bounding the probability that the system states stay inside the tube and bounding the probability of the tube colliding with obstacles, our approach guarantees a bounded probability of the system states colliding with obstacles. We demonstrate our approach on several long-term robotics tasks.
|
| |
| 15:06-15:12, Paper MoBT9.12 | Add to My Program |
| Parallelized Control-Aware Motion Planning with Learned Controller Proxies |
|
| Chow, Scott | Oregon State University |
| Chang, Dongsik | Amazon |
| Hollinger, Geoffrey | Oregon State University |
Keywords: Motion and Path Planning, Integrated Planning and Control, Integrated Planning and Learning
Abstract: Kinodynamic motion planning enables autonomous robots to find efficient paths while minimizing energy expenditure and avoiding hazards in the environment. However, during plan execution, the controller may deviate from the collision-free path found by the planner due to discrepancies between planning and control, causing inaccurate estimation of path costs and potentially collisions with obstacles. While this can be mitigated by incorporating the vehicle controller into planning, these approaches are generally bottlenecked by the high computation cost of simulating the vehicle dynamics and controller. This paper presents the Parallel Closed-Loop RRT* motion planner that uses a fast neural network controller as a substitute for a computationally-demanding controller during planning. Using a neural network controller and parallelizing the planning process makes closed-loop planning tractable for vehicles with nonlinear dynamics and significantly reduces planning time. Experiments on a simulated underwater vehicle with a model predictive controller demonstrate that our approach yields feasible plans that are more likely to be successfully executed without collisions compared to planners that do not consider the controller.
|
| |
| 15:12-15:18, Paper MoBT9.13 | Add to My Program |
| Improvement of Submodular Maximization Problems with Routing Constraints Via Submodularity and Fourier Sparsity |
|
| Lin, Pao-Te | National Central University |
| Tseng, Kuo-Shih | National Central University |
Keywords: Mapping, Search and Rescue Robots, Motion and Path Planning
Abstract: Various robotic problems (e.g., map exploration, environmental monitoring, and spatial search) can be formulated as submodular maximization problems with routing constraints. These problems involve two NP-hard problems: maximal coverage and the traveling salesman problem. The generalized cost-benefit algorithm (GCB) can solve this problem with a (1/2)(1 - 1/e) OPT~ guarantee, where OPT~ is an approximation of the optimal performance; a gap remains between OPT~ and the optimal solution (OPT). In this research, the proposed algorithms, Tree-Structured Fourier Supports Set (TS-FSS), utilize the submodularity and sparsity of routing trees to boost GCB performance. The theorems show that the proposed algorithms have a higher optimality bound than GCB. The experiments demonstrate that the proposed approach outperforms benchmark approaches.
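As background for the cost-benefit idea GCB builds on, here is a simplified budgeted greedy for submodular coverage that repeatedly picks the element with the best marginal gain per unit cost. This is a sketch of the generic pattern only; the actual GCB algorithm also accounts for routing costs, and all names here are illustrative.

```python
def greedy_coverage(candidates, cover, cost, budget):
    """Budgeted cost-benefit greedy for a coverage (submodular) objective.

    candidates -- ordered list of selectable elements
    cover[c]   -- set of targets element c covers
    cost[c]    -- cost of selecting c
    budget     -- total cost budget
    """
    covered, chosen, spent = set(), [], 0.0
    remaining = list(candidates)
    while remaining:
        best, best_ratio = None, 0.0
        for c in remaining:
            gain = len(cover[c] - covered)        # marginal submodular gain
            if gain > 0 and spent + cost[c] <= budget:
                ratio = gain / cost[c]            # benefit per unit cost
                if ratio > best_ratio:
                    best, best_ratio = c, ratio
        if best is None:                          # nothing affordable/useful
            break
        chosen.append(best)
        covered |= cover[best]
        spent += cost[best]
        remaining.remove(best)
    return chosen, covered
```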
|
| |
| MoBT10 Regular session, 250ABC |
Add to My Program |
| Learning for Manipulation II |
|
| |
| Chair: Ogata, Tetsuya | Waseda University |
| Co-Chair: Gupta, Satyandra K. | University of Southern California |
| |
| 14:00-14:06, Paper MoBT10.1 | Add to My Program |
| Learning Bifunctional Push-Grasping Synergistic Strategy for Goal-Agnostic and Goal-Oriented Tasks |
|
| Ren, Dafa | Shanghai University |
| Wu, Shuang | Huawei |
| Wang, Xiaofan | Shanghai University |
| Peng, Yan | Shanghai University |
| Ren, Xiaoqiang | Shanghai University |
Keywords: Grasping, Reinforcement Learning, Deep Learning in Grasping and Manipulation
Abstract: Both goal-agnostic and goal-oriented tasks have practical value for robotic grasping: goal-agnostic tasks target all objects in the workspace, while goal-oriented tasks aim at grasping pre-assigned goal objects. However, most current grasping methods handle only one of the two tasks well. In this work, we propose a bifunctional push-grasping synergistic strategy for goal-agnostic and goal-oriented grasping tasks. Our method integrates pushing with grasping to pick up all objects or pre-assigned goal objects with high action efficiency, depending on the task requirement. We introduce a bifunctional network, which takes in visual observations and outputs dense pixel-wise maps of Q values for pushing and grasping primitive actions, to increase the available samples in the action space. We then propose a hierarchical reinforcement learning framework that coordinates the two tasks by treating the goal-agnostic task as a combination of multiple goal-oriented tasks. To reduce the training difficulty of the hierarchical framework, we design a two-stage training method that trains the two types of tasks separately. We pre-train the model in simulation and then transfer the learned model to the real world without any additional real-world fine-tuning. Experimental results show that the proposed approach outperforms existing methods in task completion rate and grasp success rate with fewer motions. Supplementary material is available at https://github.com/DafaRen/Learning_Bifunctional_Push-grasping_Synergistic_Strategy_for_Goal-agnostic_and_Goal-oriented_Tasks
|
| |
| 14:06-14:12, Paper MoBT10.2 | Add to My Program |
| Visual Spatial Attention and Proprioceptive Data-Driven Reinforcement Learning for Robust Peg-In-Hole Task under Variable Conditions |
|
| Yasutomi, André Yuji | Hitachi Ltd |
| Ichiwara, Hideyuki | Hitachi, Ltd. / Waseda University |
| Ito, Hiroshi | Hitachi, Ltd |
| Mori, Hiroki | Waseda University |
| Ogata, Tetsuya | Waseda University |
Keywords: Robotics and Automation in Construction, Reinforcement Learning, Deep Learning for Visual Perception
Abstract: Anchor-bolt insertion is a peg-in-hole task performed in the construction field for holes in concrete. Efforts have been made to automate this task, but the variable lighting and hole surface conditions, as well as the requirements for short setup and task execution times, make automation challenging. In this study, we introduce a vision and kinesthetic data-driven robot control model for this task that is robust to challenging lighting and hole surface conditions. The model consists of a spatial attention point network (SAP) and a deep reinforcement learning (DRL) policy that are trained jointly, end-to-end, to control the robot. The model is trained offline, with a sample-efficient framework designed to reduce training time and minimize the reality gap when transferring the model to the physical world. Through evaluations with an industrial robot performing the task in 12 unknown holes, starting from 16 different initial positions, and under three different lighting conditions (two with misleading shadows), we demonstrate that SAP can generate relevant attention points in the image even under challenging lighting conditions. We also show that the proposed model enables task execution with a higher success rate and shorter task completion time than various baselines. Because the proposed model remains effective even under severe lighting, initial position, and hole conditions, and because the offline training framework is sample-efficient with a short training time, this approach can be readily applied in construction.
|
| |
| 14:12-14:18, Paper MoBT10.3 | Add to My Program |
| Domain Adaptation on Point Clouds for 6D Pose Estimation in Bin-Picking Scenarios |
|
| Zhao, Liang | Tsinghua University |
| Sun, Meng | Tsinghua University |
| Lv, Weijie | Tsinghua University |
| Zhang, Xinyu | Tsinghua University |
| Zeng, Long | Tsinghua University |
Keywords: Computer Vision for Manufacturing, Transfer Learning, Deep Learning in Grasping and Manipulation
Abstract: Training with simulated data is a common approach in pose estimation research. However, the sim-to-real gap between clean simulated data and noisy real data seriously weakens the generalization ability of such algorithms, especially for point clouds. To address this problem, this paper proposes a domain adaptive pose estimation network (DAPE-Net). For the features extracted by the backbone, a feature discriminator distinguishes real from simulated data, and pose estimation is completed through adversarial training. This drives the network to focus on the domain-invariant features of simulated and real point clouds, achieving domain adaptation. In our experiments, DAPE-Net improved pose estimation performance by 10%. Because domain adaptation requires a small amount of real data, we also propose a scheme that semi-automatically collects real data in bin-picking scenarios for 6D pose estimation.
|
| |
| 14:18-14:24, Paper MoBT10.4 | Add to My Program |
| Learning Robotic Powder Weighing from Simulation for Laboratory Automation |
|
| Kadokawa, Yuki | Nara Institute of Science and Technology |
| Hamaya, Masashi | OMRON SINIC X Corporation |
| Tanaka, Kazutoshi | OMRON SINIC X Corporation |
Keywords: Robotics and Automation in Life Sciences
Abstract: This study focuses on a robotic powder weighing task used in laboratory automation. In this task, a robot weighs out a milligram-level target mass of powder using a dispensing spoon. The complex dynamics of the powder, the variation in the materials being weighed, and the need to balance conservative and aggressive actions are significant challenges in robotics. Therefore, learning approaches are critical for this task. However, extensive learning interactions in real-world environments require substantial effort to clean up spilled powder. To overcome this issue, this study employs sim-to-real transfer learning with a domain randomization (DR) technique. This enables the robot to weigh various powders to a small target mass and alleviates the burden of collecting data in a real-world environment. We formulate weighing manipulation as a reinforcement learning problem, develop a powder weighing simulator, and carefully select the dynamics parameters used for DR to adapt to unseen environments. A recurrent neural network-based policy is adopted to balance conservative and aggressive actions. Sim-to-real zero-shot transfer experiments demonstrate that the robot completed the weighing tasks with an average weighing error of 0.1–0.2 mg for different powder materials and target masses (5–15 mg). Overall, this approach shows promising results and can be useful for automating laboratory tasks that involve weighing powders.
|
| |
| 14:24-14:30, Paper MoBT10.5 | Add to My Program |
| Constrained Generative Sampling of 6-DoF Grasps |
|
| Lundell, Jens | Royal Institute of Technology |
| Verdoja, Francesco | Aalto University |
| Nguyen Le, Tran | Aalto University |
| Mousavian, Arsalan | NVIDIA |
| Fox, Dieter | University of Washington |
| Kyrki, Ville | Aalto University |
Keywords: Deep Learning in Grasping and Manipulation, Grasping
Abstract: Most state-of-the-art data-driven grasp sampling methods propose stable and collision-free grasps uniformly on the target object. For bin-picking, executing any of those reachable grasps is sufficient. However, for completing specific tasks, such as squeezing out liquid from a bottle, we want the grasp to be on a specific part of the object's body while avoiding other locations, such as the cap. This work presents a generative grasp sampling network, VCGS, capable of constrained 6-Degrees of Freedom (DoF) grasp sampling. In addition, we curate a new dataset designed to train and evaluate methods for constrained grasping. The new dataset, called CONG, consists of over 14 million training samples of synthetically rendered point clouds and grasps at random target areas on 2889 objects. VCGS is benchmarked against GraspNet, a state-of-the-art unconstrained grasp sampler, in simulation and on a real robot. The results demonstrate that VCGS achieves a 10-15% higher grasp success rate than the baseline while being 2-3 times as sample efficient. Supplementary material is available on our project website.
|
| |
| 14:30-14:36, Paper MoBT10.6 | Add to My Program |
| RGBD Fusion Grasp Network with Large-Scale Tableware Grasp Dataset |
|
| Yoon, Jaemin | Samsung Research |
| Ahn, Joonmo | Samsung Electronics |
| Ha, Changsu | Samsung Electronics |
| Chung, Rakjoon | Samsung Electronics |
| Park, Dongwoo | Samsung Electronics |
| Han, Heungwoo | Samsung Research |
| Kang, Sung-Chul | Samsung Research, Samsung Electronics |
Keywords: Deep Learning in Grasping and Manipulation, Data Sets for Robot Learning, Grasping
Abstract: This paper proposes a novel approach to address the technical challenges of stable object grasping, particularly in the context of handling tableware in a home environment. Handling tableware is particularly important, yet challenging, due to the flat nature of most tableware objects and the need to maintain a stable posture to prevent spills. To address these challenges, we present three key contributions: 1) a large-scale tableware dataset, not commonly found in previous datasets; 2) a novel sampling method for stable grasp pose generation; and 3) a multi-modal fusion grasp network that effectively learns 6-DoF grasp poses, including for flat objects. Our dataset contains over 45 million grasp poses and 1 million RGBD images captured in 800 scenes, each containing 10-18 randomly selected tableware objects under 4 different lighting conditions. The grasp poses in the dataset are generated using a novel sampling method that incorporates geometric analysis to ensure stable grasping with minimal object movement. Furthermore, we design an RGBD fusion grasp network (RGBD-FGN) that combines information from RGB and depth images while considering the characteristics of each. Our experimental results demonstrate the superior performance of our approach over existing techniques, which is a significant contribution towards developing a multitasking home robot. Our dataset and source code can be accessed at https://github.com/SamsungLabs/RGBD-FGN.
|
| |
| 14:36-14:42, Paper MoBT10.7 | Add to My Program |
| One-Shot Affordance Learning (OSAL): Learning to Manipulate Articulated Objects by Observing Once |
|
| Fan, Ruomeng | The University of Tokyo |
| Wang, Taohan | The University of Tokyo School of Engineering |
| Hirano, Masahiro | The University of Tokyo |
| Yamakawa, Yuji | The University of Tokyo |
Keywords: Learning from Demonstration, Deep Learning in Grasping and Manipulation
Abstract: We present One-Shot Affordance Learning (OSAL): a unified pipeline that learns manipulation for articulated objects by observing human demonstration only once. The key idea of our method is to embody affordance of articulated objects with an open-loop trajectory conditioned on a certain area of the object's surface. It serves as a simplified object-centric manipulation representation, which can be easily transferred into robot motion, while traditional methods fail to deal with the configuration difference between human hands and robot end effectors. Our system extracts the embodied affordance by focusing on hand action's effect on the object, and further grounds such affordance into object visual features through self-supervised learning for novel object configurations. We evaluated our method on a collection of real-life objects and furniture and demonstrated high success rates. With our system, humans only need to manipulate a novel object once with any gesture to transfer that manipulation skill to the robot, which we believe to be a highly efficient and user-friendly paradigm oriented for future real-life robots.
|
| |
| 14:42-14:48, Paper MoBT10.8 | Add to My Program |
| EARL: Eye-On-Hand Reinforcement Learner for Dynamic Grasping with Active Pose Estimation |
|
| Huang, Baichuan | Rutgers University |
| Yu, Jingjin | Rutgers University |
| Jain, Siddarth | Mitsubishi Electric Research Laboratories (MERL) |
Keywords: Grasping, Perception for Grasping and Manipulation, Reinforcement Learning
Abstract: We explore the dynamic grasping of moving objects through active pose tracking and reinforcement learning for hand-eye coordination systems. Most existing vision-based robotic grasping methods implicitly assume target objects are stationary or moving predictably. Performing grasping of unpredictably moving objects presents a unique set of challenges. For example, a pre-computed robust grasp can become unreachable or unstable as the target object moves, and motion planning must also be adaptive. In this work, we present a new approach, Eye-on-hAnd Reinforcement Learner (EARL), for enabling coupled Eye-on-Hand (EoH) robotic manipulation systems to perform real-time active pose tracking and dynamic grasping of novel objects without explicit motion prediction. EARL readily addresses many thorny issues in automated hand-eye coordination, including fast-tracking of 6D object pose from vision, learning control policy for a robotic arm to track a moving object while keeping the object in the camera's field of view, and performing dynamic grasping. We demonstrate the effectiveness of our approach in extensive experiments validated on multiple commercial robotic arms in both simulations and complex real-world tasks.
|
| |
| 14:48-14:54, Paper MoBT10.9 | Add to My Program |
| KGNv2: Separating Scale and Pose Prediction for Keypoint-Based Grasp Synthesis on RGB-D Input |
|
| Chen, Yiye | Georgia Institute of Technology |
| Xu, Ruinian | Georgia Institute of Technology |
| Lin, Yunzhi | Georgia Institute of Technology |
| Chen, Hongyi | Georgia Institute of Technology |
| Vela, Patricio | Georgia Institute of Technology |
Keywords: Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation, Grasping
Abstract: We propose an improved keypoint approach for 6-DoF grasp pose synthesis from RGB-D input. Keypoint-based grasp detection from image input demonstrated promising results in a previous study, where the visual information provided by color imagery compensates for noisy or imprecise depth measurements. However, it relies heavily on accurate keypoint prediction in image space. We devise a new grasp generation network that reduces the dependency on precise keypoint estimation. Given an RGB-D input, the network estimates both the grasp pose and the camera-grasp length scale. A redesign of the keypoint output space mitigates the impact of keypoint prediction noise on Perspective-n-Point (PnP) algorithm solutions. Experiments show that the proposed method outperforms the baseline by a large margin, validating its design. Though trained only on simple synthetic objects, our method demonstrates sim-to-real capacity through competitive results in real-world robot experiments.
|
| |
| 14:54-15:00, Paper MoBT10.10 | Add to My Program |
| Learning-Based Real-Time Torque Prediction for Grasping Unknown Objects with a Multi-Fingered Hand |
|
| Winkelbauer, Dominik | DLR |
| Bäuml, Berthold | DLR |
| Triebel, Rudolph | German Aerospace Center (DLR) |
Keywords: Deep Learning in Grasping and Manipulation, Grasping, Multifingered Hands
Abstract: When grasping objects with a multi-finger hand, it is crucial for grasp stability to apply the correct torques at each joint so that external forces are countered. Most current systems use simple heuristics instead of modeling the required torques correctly. Instead, we propose a learning-based approach that is able to predict torques for grasps on unknown objects in real time. The neural network, trained end-to-end using supervised learning, is shown to predict torques that are more efficient, and the objects are held with less involuntary movement compared to all tested heuristic baselines. Specifically, for 90% of the grasps the translational deviation of the object is below 2.9 mm and the rotational deviation below 3.1°. To generate training data, we formulate the analytical computation of torques as an optimization problem and handle the indeterminacy of multi-contacts using an elastic model. We further show that the network generalizes to predict torques for unknown objects on the real robot system with an inference time of 1.5 ms.
|
| |
| 15:00-15:06, Paper MoBT10.11 | Add to My Program |
| A Grasp Pose Is All You Need: Learning Multi-Fingered Grasping with Deep Reinforcement Learning from Vision and Touch |
|
| Ceola, Federico | Istituto Italiano Di Tecnologia |
| Maiettini, Elisa | Humanoid Sensing and Perception, Istituto Italiano Di Tecnologia |
| Rosasco, Lorenzo | Istituto Italiano Di Tecnologia & Massachusetts Institute of Technology |
| Natale, Lorenzo | Istituto Italiano Di Tecnologia |
Keywords: Grasping, Reinforcement Learning, Humanoid Robot Systems
Abstract: Multi-fingered robotic hands have the potential to enable robots to perform sophisticated manipulation tasks. However, teaching a robot to grasp objects with an anthropomorphic hand is an arduous problem due to the high dimensionality of the state and action spaces. Deep Reinforcement Learning (DRL) offers techniques to design control policies for this kind of problem without explicit environment or hand modeling. However, state-of-the-art model-free algorithms have proven inefficient for learning such policies. The main problem is that exploration of the environment is unfeasible for such high-dimensional problems, hampering the initial phases of policy optimization. One possibility to address this is to rely on off-line task demonstrations, but, oftentimes, collecting them is too demanding in terms of time and computational resources. To address these problems, we propose the A Grasp Pose is All You Need (G-PAYN) method for the anthropomorphic hand of the iCub humanoid. We develop an approach to automatically collect task demonstrations to initialize the training of the policy. The proposed grasping pipeline starts from a grasp pose generated by an external algorithm, which is used to initiate the movement. A control policy (previously trained with the proposed G-PAYN) is then used to reach and grab the object. We deployed the iCub in the MuJoCo simulator and used it to test our approach with objects from the YCB-Video dataset. Results show that G-PAYN outperforms current DRL baselines in the considered setting in terms of success rate and execution time. The code to reproduce the experiments is released together with the paper under an open-source license.
|
| |
| 15:06-15:12, Paper MoBT10.12 | Add to My Program |
| Physics-Informed Learning to Enable Robotic Screw-Driving under Hole Pose Uncertainties |
|
| Manyar, Omey Mohan | University of Southern California |
| Varadanahalli Narayan, Santosh | University of Southern California |
| Lengade, Rohin | University of Southern California |
| Gupta, Satyandra K. | University of Southern California |
Keywords: Learning Categories and Concepts, Compliance and Impedance Control, Industrial Robots
Abstract: Screw-driving is an important operation in numerous applications. In many situations, the hole pose cannot be estimated very accurately. Autonomous screw-driving cannot be performed by traditional industrial manipulators in position control mode when the hole pose uncertainty is high. This paper presents a mobile manipulator system for performing autonomous screw-driving in the presence of uncertainties in the hole estimates. It deals with uncertainty using active compliance, in the form of impedance control of the robot, and passive compliance, in the screw-driving tool. We present a physics-informed machine learning approach to automatically characterize the motion of the screw tip and explain how this motion leads to successful operation in the presence of uncertainty. We also present an approach for detecting failure modes and taking corrective actions. Code and a video are available at: https://sites.google.com/usc.edu/physicsinformedscrewdriving
|
| |
| MoBT11 Regular session, 251ABC |
Add to My Program |
| Aerial Systems - Applications II |
|
| |
| Chair: Vamvoudakis, Kyriakos G. | Georgia Inst. of Tech |
| Co-Chair: Alexis, Kostas | NTNU - Norwegian University of Science and Technology |
| |
| 14:00-14:06, Paper MoBT11.1 | Add to My Program |
| Viewpoint-Driven Formation Control of Airships for Cooperative Target Tracking |
|
| Price, Eric | Universität Stuttgart |
| Black, Michael | Max Planck Institute for Intelligent Systems in Tübingen |
| Ahmad, Aamir | University of Stuttgart |
Keywords: Aerial Systems: Perception and Autonomy, Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems
Abstract: For tracking and motion capture (MoCap) of animals in their natural habitat, a formation of safe and silent aerial platforms, such as airships with on-board cameras, is well suited. In our prior work we derived formation properties for optimal MoCap, which include maintaining constant angular separation between observers w.r.t. the subject, a threshold distance to it, and keeping it centered in the camera view. Unlike multi-rotors, airships have non-holonomic constraints and are affected by ambient wind. Their orientation and flight direction are also tightly coupled. Therefore, a control scheme for multicopters that assumes independence of motion direction and orientation is not applicable. In this paper, we address this problem by first exploiting a periodic relationship between the airspeed of an airship and its distance to the subject. We use it to derive analytical and numeric solutions that satisfy the formation properties for optimal MoCap. Based on this, we developed an MPC-based formation controller. We performed a theoretical analysis of our solution and of the boundary conditions of its applicability, extensive simulation experiments, and a real-world demonstration of our control method with an unmanned airship. Open-source code (https://tinyurl.com/AsMPCCode) and a video of our method (https://tinyurl.com/AsMPCVid) are provided.
|
| |
| 14:06-14:12, Paper MoBT11.2 | Add to My Program |
| ADMNet: Anti-Drone Real-Time Detection and Monitoring |
|
| Zhou, Xunkuai | Tongji University |
| Yang, Guidong | The Chinese University of Hong Kong |
| Chen, Yizhou | Chinese University of Hong Kong |
| Gao, Chuanxiang | The Chinese University of Hong Kong |
| Zhao, Benyun | The Chinese University of Hong Kong |
| Li, Li | Tongji University |
| Chen, Ben M. | Chinese University of Hong Kong |
Keywords: Computer Vision for Automation, Industrial Robots, Object Detection, Segmentation and Categorization
Abstract: We propose a lightweight, effective, and efficient anti-drone network, namely ADMNet, for visually detecting and monitoring unfriendly drones with a constrained field of view, flying against complex backgrounds. We merge an SPP module into the first head of YOLOv4 to improve accuracy and perform network compression to reduce inference latency and model size. To compensate for the accuracy loss caused by compression, we propose an SPPS module and a ResNeck module for the neck of the network and implement an effective attention module for the backbone. Eventually, we present an accurate and compact ADMNet of only 3.9 MB, ensuring low computational cost and real-time detection. Our method achieves state-of-the-art performance on three challenging real-world datasets (Average Precision @0.5 IoU): Det-Fly 96.2%, NPS-Drones 92.0%, and TIBNet 89.7%. Its throughput is also higher than that of prior work, in addition to its superior detection performance. Comparative testing in real-world scenarios shows that our method exhibits strong reliability and generalization ability. Deploying the network on drone onboard edge-computing devices enables real-time detection and monitoring of flying drones, highlighting the portability and viability of ADMNet.
|
| |
| 14:12-14:18, Paper MoBT11.3 | Add to My Program |
| Multi-View Stereo with Learnable Cost Metric |
|
| Yang, Guidong | The Chinese University of Hong Kong |
| Zhou, Xunkuai | Tongji University |
| Gao, Chuanxiang | The Chinese University of Hong Kong |
| Zhao, Benyun | The Chinese University of Hong Kong |
| Zhang, Jihan | Chinese University of Hong Kong |
| Chen, Yizhou | Chinese University of Hong Kong |
| Chen, Xi | The Chinese University of Hong Kong |
| Chen, Ben M. | Chinese University of Hong Kong |
Keywords: Computer Vision for Automation, Aerial Systems: Applications, Deep Learning Methods
Abstract: In this paper, we present LCM-MVSNet, a novel multi-view stereo (MVS) network with a learnable cost metric (LCM) for more accurate and complete depth estimation and dense point cloud reconstruction. To adapt to scene variation and improve reconstruction quality in non-Lambertian, low-textured scenes, we propose the LCM to adaptively aggregate multi-view matching similarity into the 3D cost volume by leveraging sparse point hints. The proposed LCM benefits MVS approaches in four respects: depth estimation enhancement, reconstruction quality improvement, memory footprint reduction, and computational burden alleviation, allowing depth inference on high-resolution images to achieve more accurate and complete reconstruction. Moreover, we improve depth estimation by enhancing the propagation of shallow features via a bottom-up path and strengthen the end-to-end supervision by adapting the focal loss to reduce ambiguity caused by sample imbalance. Extensive experiments on two benchmark datasets show that our network achieves state-of-the-art performance on the DTU dataset and exhibits strong generalization ability with competitive performance on the Tanks and Temples benchmark. Furthermore, we deploy LCM-MVSNet in a real-world application for large-scale 3D reconstruction from multi-view aerial images collected by a self-developed UAV, demonstrating the robustness and scalability of our method. More detailed results are available in the Appendix.
|
| |
| 14:18-14:24, Paper MoBT11.4 | Add to My Program |
| A Comparison between Frame-Based and Event-Based Cameras for Flapping-Wing Robot Perception |
|
| Tapia, Raul | University of Seville |
| Rodriguez-Gomez, Juan Pablo | University of Seville |
| Sánchez Díaz, Juan Antonio | University of Seville |
| Gañán, Francisco Javier | Universidad De Sevilla |
| Gutierrez Rodriguez, Ivan | University of Seville |
| Luna-Santamaria, Javier | University of Seville |
| Martinez-de Dios, J.R. | University of Seville |
| Ollero, Anibal | AICIA. G41099946 |
Keywords: Aerial Systems: Perception and Autonomy
Abstract: Perception systems for ornithopters face severe challenges. The harsh vibrations and abrupt movements caused by flapping are prone to produce motion blur and strong lighting-condition changes. Strict restrictions on weight, size, and energy consumption also limit the type and number of sensors that can be mounted onboard. Lightweight traditional cameras have become a standard off-the-shelf solution in many flapping-wing designs. However, bioinspired event cameras are a promising alternative for ornithopter perception due to their microsecond temporal resolution, high dynamic range, and low power consumption. This paper presents an experimental comparison between a frame-based and an event-based camera. Both technologies are analyzed considering the particular specifications of flapping-wing robots, and the performance of well-known vision algorithms is evaluated experimentally on data recorded onboard a flapping-wing robot. Our results suggest event cameras are the most suitable sensors for ornithopters. Nevertheless, they also evidence the open challenges for event-based vision on board flapping-wing robots.
|
| |
| 14:24-14:30, Paper MoBT11.5 | Add to My Program |
| Flexible Multi-DoF Aerial 3D Printing Supported with Automated Optimal Chunking |
|
| Stamatopoulos, Marios-Nektarios | Luleå University of Technology |
| Banerjee, Avijit | Luleå University of Technology |
| Nikolakopoulos, George | Luleå University of Technology |
Keywords: Robotics and Automation in Construction, Additive Manufacturing
Abstract: The future of 3D printing with unmanned aerial vehicles (UAVs) presents a promising capability to revolutionize manufacturing and to enable the creation of large-scale structures in remote and hard-to-reach areas, e.g., in other planetary systems. Nevertheless, the limited payload capacity of UAVs and the complexity of 3D printing large objects pose significant challenges. In this article we propose a novel chunk-based framework for distributed 3D printing using UAVs that sets the basis for fully collaborative aerial 3D printing of challenging structures. Through a novel optimisation process, the presented framework divides the 3D model to be printed into small, manageable chunks and assigns each chunk to a UAV for partial printing, in a fully autonomous approach. Thus, we establish the algorithms for chunk division, allocation, and printing, and we also introduce a novel algorithm that efficiently partitions the mesh into planar chunks while accounting for the inter-connectivity constraints between chunks. The efficiency of the proposed framework is demonstrated through multiple physics-based simulations in Gazebo, where a CAD construction mesh is printed via multiple UAVs carrying materials whose volume is proportionate to a fraction of the total mesh volume.
|
| |
| 14:30-14:36, Paper MoBT11.6 | Add to My Program |
| Memory Maps for Video Object Detection and Tracking on UAVs |
|
| Kiefer, Benjamin | University of Tuebingen |
| Quan, Yitong | University of Tuebingen |
| Zell, Andreas | University of Tübingen |
Keywords: Aerial Systems: Perception and Autonomy, Data Sets for Robotic Vision, Object Detection, Segmentation and Categorization
Abstract: This paper introduces a novel approach to video object detection and tracking on Unmanned Aerial Vehicles (UAVs). By incorporating metadata, the proposed approach creates a memory map of object locations in actual world coordinates, providing a more robust and interpretable representation of object locations in both image space and the real world. We use this representation to boost confidences, resulting in improved performance on several temporal computer vision tasks, such as video object detection, short- and long-term single- and multi-object tracking, and video anomaly detection. These findings confirm the benefits of metadata in enhancing the capabilities of UAVs in the field of temporal computer vision and pave the way for further advancements in this area.
|
| |
| 14:36-14:42, Paper MoBT11.7 | Add to My Program |
| Robust Localization of Aerial Vehicles Via Active Control of Identical Ground Vehicles |
|
| Spasojevic, Igor | University of Pennsylvania |
| Liu, Xu | University of Pennsylvania |
| Prabhu, Ankit | University of Pennsylvania |
| Ribeiro, Alejandro | University of Pennsylvania |
| Pappas, George J. | University of Pennsylvania |
| Kumar, Vijay | University of Pennsylvania |
Keywords: Aerial Systems: Perception and Autonomy, Planning, Scheduling and Coordination, Localization
Abstract: This paper addresses the problem of active collaborative localization in heterogeneous robot teams with unknown data association. It involves positioning a small number of identical unmanned ground vehicles (UGVs) at desired positions so that an unmanned aerial vehicle (UAV) can, through unlabelled measurements of UGVs, uniquely determine its global pose. We model the problem as a sequential two player game, in which the first player positions the UGVs and the second identifies the two distinct hypothetical poses of the UAV at which the sets of measurements to the UGVs differ by as little as possible. We solve the underlying problem from the vantage point of the first player for a subclass of measurement models using a mixture of local optimization and exhaustive search procedures. Real-world experiments with a team of UAV and UGVs show that our method can achieve centimeter-level global localization accuracy. We also show that our method consistently outperforms random positioning of UGVs by a large margin, with as much as a 90% reduction in position and angular estimation error. Our method can tolerate a significant amount of random as well as non-stochastic measurement noise. This indicates its potential for reliable state estimation on board size, weight, and power (SWaP) constrained UAVs. This work enables robust localization in perceptually-challenged GPS-denied environments, thus paving the road for large-scale multi-robot navigation and mapping.
|
| |
| 14:42-14:48, Paper MoBT11.8 | Add to My Program |
| Semantically-Enhanced Deep Collision Prediction for Autonomous Navigation Using Aerial Robots |
|
| Kulkarni, Mihir | NTNU: Norwegian University of Science and Technology |
| Nguyen, Huan | NTNU - Norwegian University of Science and Technology |
| Alexis, Kostas | NTNU - Norwegian University of Science and Technology |
Keywords: Aerial Systems: Perception and Autonomy
Abstract: This paper contributes a novel and modularized learning-based method for aerial robots navigating cluttered environments containing hard-to-perceive thin obstacles without assuming access to a map or the full pose estimation of the robot. The proposed solution builds upon a semantically-enhanced Variational Autoencoder that is trained with both real-world and simulated depth images to compress the input data, while preserving semantically-labeled thin obstacles and handling invalid pixels in the depth sensor's output. This compressed representation, in addition to the robot's partial state involving its linear/angular velocities and its attitude are then utilized to train an uncertainty-aware 3D Collision Prediction Network in simulation to predict collision scores for candidate action sequences in a predefined motion primitives library. A set of simulation and experimental studies in cluttered environments with various sizes and types of obstacles, including multiple hard-to-perceive thin objects, were conducted to evaluate the performance of the proposed method and compare against an end-to-end trained baseline. The results demonstrate the benefits of the proposed semantically-enhanced deep collision prediction for learning-based autonomous navigation.
|
| |
| 14:48-14:54, Paper MoBT11.9 | Add to My Program |
| Demonstrating Autonomous 3D Path Planning on a Novel Scalable UGV-UAV Morphing Robot |
|
| Sihite, Eric | California Institute of Technology |
| Slezak, Filip | Caltech |
| Mandralis, Ioannis | Caltech |
| Salagame, Adarsh | Northeastern University |
| Ramezani, Milad | CSIRO |
| Kalantari, Arash | NASA JPL |
| Ramezani, Alireza | Northeastern University |
| Morteza, Gharib | CALTECH |
Keywords: Wheeled Robots, Aerial Systems: Applications, Motion and Path Planning
Abstract: Some animals exhibit multi-modal locomotion capabilities to traverse a wide range of terrains and environments, such as amphibians that can swim and walk or birds that can fly and walk. This capability is extremely beneficial because it expands the animal's habitat range and allows it to choose the most energy-efficient mode of locomotion in a given environment. Robotic biomimicry of this multi-modal locomotion capability can be very challenging but offers the same advantages. However, the expanded range of locomotion also increases the complexity of localization and path planning. In this work, we present our morphing multi-modal robot, which is capable of ground and aerial locomotion, and the implementation of readily available SLAM and path planning solutions to navigate a complex indoor environment.
|
| |
| 14:54-15:00, Paper MoBT11.10 | Add to My Program |
| Topology-Guided Perception-Aware Receding Horizon Trajectory Generation for UAVs |
|
| Sun, Gang | Dalian University of Technology |
| Zhang, Xuetao | Dalian University of Technology |
| Liu, Yisha | Dalian Maritime University |
| Wang, Hanzhang | Dalian University of Technology |
| Zhang, Xuebo | Nankai University |
| Zhuang, Yan | Dalian University of Technology |
Keywords: Motion and Path Planning, Aerial Systems: Applications, Autonomous Vehicle Navigation
Abstract: Perception-aware motion planning based on localization uncertainty has the potential to improve localization accuracy for robot navigation. However, most existing perception-aware methods pre-build a global feature map and cannot generate perception-aware trajectories in real time. This paper proposes a topology-guided perception-aware receding horizon trajectory generation method, which comprises topology-guided position trajectory generation and perception-aware yaw angle trajectory generation. Specifically, a memorable active map is built by selectively storing visual landmarks. A library of candidate topological trajectories is then generated, and the candidates are evaluated in terms of perception quality based on the active map, smoothness, collision possibility, and feasibility. In addition, the yaw angle trajectory is obtained through a front-end multiple refined path search and a back-end path-guided trajectory optimization. Comparative simulation and real-world experiments confirm that the proposed method keeps more visual features in the view and reduces localization error.
|
| |
| 15:00-15:06, Paper MoBT11.11 | Add to My Program |
| Learned Inertial Odometry for Autonomous Drone Racing |
|
| Cioffi, Giovanni | University of Zurich |
| Bauersfeld, Leonard | University of Zurich (UZH) |
| Kaufmann, Elia | University of Zurich |
| Scaramuzza, Davide | University of Zurich |
Keywords: Aerial Systems: Perception and Autonomy, Aerial Systems: Applications, Deep Learning Methods
Abstract: Inertial odometry is an attractive solution to the problem of state estimation for agile quadrotor flight. It is inexpensive, lightweight, and it is not affected by perceptual degradation. However, only relying on the integration of the inertial measurements for state estimation is infeasible. The errors and time-varying biases present in such measurements cause the accumulation of large drift in the pose estimates. Recently, inertial odometry has made significant progress in estimating the motion of pedestrians. State-of-the-art algorithms rely on learning a motion prior that is typical of humans but cannot be transferred to drones. In this work, we propose a learning-based odometry algorithm that uses an inertial measurement unit (IMU) as the only sensor modality for autonomous drone racing tasks. The core idea of our system is to couple a model-based filter, driven by the inertial measurements, with a learning-based module that has access to the thrust measurements. We show that our inertial odometry algorithm is superior to the state-of-the-art filter-based and optimization-based visual-inertial odometry as well as the state-of-the-art learned-inertial odometry in estimating the pose of an autonomous racing drone. Additionally, we show that our system is comparable to a visual-inertial odometry solution that uses a camera and exploits the known gate location and appearance. We believe that the application in autonomous drone racing paves the way for novel research in inertial odometry for agile quadrotor flight.
|
| |
| 15:06-15:12, Paper MoBT11.12 | Add to My Program |
| Nonlinear Deterministic Observer for Inertial Navigation Using Ultra-Wideband and IMU Sensor Fusion |
|
| Hashim, Hashim A. | Carleton University |
| E. E. Eltoukhy, Abdelrahman | The Hong Kong Polytechnic University |
| Vamvoudakis, Kyriakos G. | Georgia Inst. of Tech |
| Abouheaf, Mohammed | University of Ottawa |
Keywords: Aerial Systems: Perception and Autonomy, SLAM, Optimization and Optimal Control
Abstract: Navigation in Global Positioning System (GPS)-denied environments requires robust estimators, reliant on fusion of inertial sensors, able to estimate a rigid body's orientation, position, and linear velocity. Ultra-wideband (UWB) and Inertial Measurement Unit (IMU) sensors represent low-cost measurement technology that can be utilized for successful inertial navigation. This paper presents a nonlinear deterministic navigation observer in a continuous form that directly employs UWB and IMU measurements. The estimator is developed on the extended Special Euclidean Group SE_2(3) and ensures exponential convergence of the closed-loop error signals starting from almost any initial condition. The discrete version of the proposed observer is tested using a publicly available real-world dataset of a drone flight.
|
| |
| 15:12-15:18, Paper MoBT11.13 | Add to My Program |
| Precision Post-Stall Landing Using NMPC with Learned Aerodynamics |
|
| Basescu, Max | Johns Hopkins University Applied Physics Lab |
| Yeh, Bryanna | The Johns Hopkins University Applied Physics Laboratory |
| Scheuer, Luca | Johns Hopkins University Applied Physics Lab |
| Wolfe, Kevin | Johns Hopkins University Applied Physics Laboratory |
| Moore, Joseph | Johns Hopkins University Applied Physics Lab |
Keywords: Aerial Systems: Perception and Autonomy, Field Robots, Aerial Systems: Applications
Abstract: In this paper, we present an approach for achieving precision post-stall landings with medium-sized Group 1 Unmanned Aerial Systems (UAS). To do this, we employ an aggressive dive-and-stall maneuver to significantly reduce landing distance, time, and touchdown speed. Our approach relies on a nonlinear model predictive control (NMPC) algorithm and learned aerodynamic coefficients to achieve accuracy and reliability in the presence of wind disturbances. We demonstrate our approach in hardware with a 60-inch wingspan, 4.2 kg fixed-wing UAS, and show the ability to land with low speed and high accuracy using minimal throttle.
|
| |
| 15:18-15:24, Paper MoBT11.14 | Add to My Program |
| Cascaded Denoising Transformer for UAV Nighttime Tracking |
|
| Lu, Kunhan | Tongji University |
| Fu, Changhong | Tongji University |
| Wang, Yucheng | Tongji University |
| Zuo, Haobo | Tongji University |
| Zheng, Guangze | Tongji University |
| Pan, Jia | University of Hong Kong |
Keywords: Aerial Systems: Perception and Autonomy, Aerial Systems: Applications, Deep Learning for Visual Perception
Abstract: The automation of unmanned aerial vehicles (UAVs) has been greatly promoted by visual object tracking methods with onboard cameras. However, the random and complicated noise produced by the cameras seriously hinders the performance of state-of-the-art (SOTA) UAV trackers, especially in low-illumination environments. To address this issue, this work proposes an efficient plug-and-play cascaded denoising Transformer (CDT) to suppress cluttered and complex noise, thereby boosting UAV tracking performance. Specifically, a novel U-shaped cascaded denoising network is designed with a streamlined structure for efficient computation. Additionally, a shallow feature deepening (SFD) encoder and a multi-feature collaboration (MFC) decoder are constructed based on multi-head transposed self-attention (MTSA) and multi-head transposed cross-attention (MTCA), respectively. A nested residual feed-forward network (NRFN) is developed to focus more on the high-frequency information represented by noise. Extensive evaluation and test experiments demonstrate that the proposed CDT has a remarkable denoising effect and improves UAV nighttime tracking performance. The source code, pre-trained models, and experimental results are available at https://github.com/vision4robotics/CDT.
|
| |
| MoBT12 Regular session, 252AB |
Add to My Program |
| Perception for Grasping and Manipulation II |
|
| |
| Chair: Hanai, Ryo | National Institute of Industrial Science and Technology(AIST) |
| Co-Chair: Culbertson, Heather | University of Southern California |
| |
| 14:00-14:06, Paper MoBT12.1 | Add to My Program |
| Model-Free Grasping with Multi-Suction Cup Grippers for Robotic Bin Picking |
|
| Schillinger, Philipp | Bosch Center for Artificial Intelligence |
| Gabriel, Miroslav | Bosch Center for Artificial Intelligence |
| Kuss, Alexander | Robert Bosch GmbH, Corporate Sector Research and Advance Enginee |
| Ziesche, Hanna | Bosch BCAI |
| Anh Vien, Ngo | Bosch GmbH |
Keywords: Perception for Grasping and Manipulation, Computer Vision for Automation, Industrial Robots
Abstract: This paper presents a novel method for model-free prediction of grasp poses for suction grippers with multiple suction cups. Our approach is agnostic to the design of the gripper and does not require gripper-specific training data. In particular, we propose a two-step approach, where first, a neural network predicts pixel-wise grasp quality for an input image to indicate areas that are generally graspable. Second, an optimization step determines the optimal gripper selection and corresponding grasp poses based on configured gripper layouts and activation schemes. In addition, we introduce a method for automated labeling for supervised training of the grasp quality network. Experimental evaluations on a real-world industrial application with bin picking scenes of varying difficulty demonstrate the effectiveness of our method.
|
| |
| 14:06-14:12, Paper MoBT12.2 | Add to My Program |
| Vision-Based State and Pose Estimation for Robotic Bin Picking of Cables |
|
| Monguzzi, Andrea | Politecnico Di Milano |
| Cella, Christian | Politecnico Di Milano |
| Zanchettin, Andrea Maria | Politecnico Di Milano |
| Rocco, Paolo | Politecnico Di Milano |
Keywords: Perception for Grasping and Manipulation, Dual Arm Manipulation, Industrial Robots
Abstract: This paper deals with the challenging task of picking semi-deformable linear objects (SDLOs) from a bin. SDLOs are deformable elements, such as cables, joined to a rigid part such as a connector. We propose a vision-based strategy to detect, classify, and estimate the pose and the state (free or occluded) of connectors belonging to an unspecified number of SDLOs arranged in an unknown configuration in the bin. The connectors can then be grasped and manipulated by a dual-arm robot through a set of manipulation primitives. In this way, a single SDLO can be extracted from the bin and laid on the worktable. A subsequent association between the connectors and the extracted SDLOs is performed, allowing the robot to firmly grasp an SDLO at its ends to further manipulate it. The procedure is tested in bin picking operations with several kinds of SDLOs and is applied to a use case involving a collaborative wire harness assembly task.
|
| |
| 14:12-14:18, Paper MoBT12.3 | Add to My Program |
| Efficient Visuo-Haptic Object Shape Completion for Robot Manipulation |
|
| Rustler, Lukas | Ceske Vysoke Uceni Technicke V Praze, FEL |
| Matas, Jiri | Czech Technical University |
| Hoffmann, Matej | Czech Technical University in Prague, Faculty of Electrical Engi |
Keywords: Perception for Grasping and Manipulation, Force and Tactile Sensing, RGB-D Perception
Abstract: For robot manipulation, a complete and accurate object shape is desirable. Here, we present a method that combines visual and haptic reconstruction in a closed-loop pipeline. From an initial viewpoint, the object shape is reconstructed using an implicit surface deep neural network. The location with highest uncertainty is selected for haptic exploration, the object is touched, the new information from touch and a new point cloud from the camera are added, object position is re-estimated and the cycle is repeated. We extend Rustler et al. (2022) by using a new theoretically grounded method to determine the points with highest uncertainty, and we increase the yield of every haptic exploration by adding not only the contact points to the point cloud but also incorporating the empty space established through the robot movement to the object. Additionally, the solution is compact in that the jaws of a closed two-finger gripper are directly used for exploration. The object position is re-estimated after every robot action and multiple objects can be present simultaneously on the table. We achieve a steady improvement with every touch using three different metrics and demonstrate the utility of the better shape reconstruction in grasping experiments on the real robot. On average, grasp success rate increases from 63.3% to 70.4% after a single exploratory touch and to 82.7% after five touches. The collected data are publicly available at https://osf.io/j6rkd/ and code at https://github.com/ctu-vras/vishac.
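The touch-selection loop can be caricatured as follows (a toy stand-in: the real method derives per-point uncertainty from an implicit-surface network and re-estimates object pose after each action, whereas here uncertainty is a fixed array):

```python
import numpy as np

def explore(uncertainty, touches):
    """Greedily touch the most uncertain surface point, then repeat."""
    points, u = [], uncertainty.copy()
    for _ in range(touches):
        idx = int(np.argmax(u))   # select point with highest uncertainty
        points.append(idx)
        u[idx] = 0.0              # touching it resolves that uncertainty
    return points

u = np.array([0.2, 0.9, 0.1, 0.7])   # illustrative per-point uncertainties
print(explore(u, 2))                 # touches points 1 and 3
```

The paper's contribution on top of this greedy skeleton is the theoretically grounded uncertainty measure and the use of swept free space, not just contact points, to update the reconstruction.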
|
| |
| 14:18-14:24, Paper MoBT12.4 | Add to My Program |
| Force Map: Learning to Predict Contact Force Distribution from Vision |
|
| Hanai, Ryo | National Institute of Industrial Science and Technology(AIST) |
| Domae, Yukiyasu | The National Institute of Advanced Industrial Science and Techno |
| Ramirez-Alpizar, Ixchel Georgina | National Institute of Advanced Industrial Science and Technology |
| Leme, Bruno | University of Florida |
| Ogata, Tetsuya | Waseda University |
Keywords: Perception for Grasping and Manipulation, Visual Learning, Force and Tactile Sensing
Abstract: When humans see a scene, they can roughly imagine the forces applied to objects based on their experience and use them to handle the objects properly. This paper considers transferring this "force-visualization" ability to robots. We hypothesize that a rough force distribution (named "force map") can be utilized for object manipulation strategies even if accurate force estimation is impossible. Based on this hypothesis, we propose a training method to predict the force map from vision. To investigate this hypothesis, we generated scenes where objects were stacked in bulk through simulation and trained a model to predict the contact force from a single image. We further applied domain randomization to make the trained model function on real images. The experimental results showed that the model trained using only synthetic images could predict approximate patterns representing the contact areas of the objects even for real images. Then, we designed a simple algorithm to plan a lifting direction using the predicted force distribution. We confirmed that using the predicted force distribution contributes to finding natural lifting directions for typical real-world scenes. Furthermore, the evaluation through simulations showed that the disturbance caused to surrounding objects was reduced by 26% (translation displacement) and by 39% (angular displacement) for scenes where objects were overlapping.
|
| |
| 14:24-14:30, Paper MoBT12.5 | Add to My Program |
| Push to Know! - Visuo-Tactile Based Active Object Parameter Inference with Dual Differentiable Filtering |
|
| Dutta, Anirvan | BMW Group and Imperial College London |
| Burdet, Etienne | Imperial College London |
| Kaboli, Mohsen | BMW Group |
Keywords: Perception for Grasping and Manipulation, Force and Tactile Sensing
Abstract: For robotic systems to interact with objects in dynamic environments, it is essential to perceive the physical properties of the objects, such as shape, friction coefficient, mass, center of mass, and inertia. This not only eases selecting manipulation actions but also ensures the task is performed as desired. However, estimating the physical properties of objects, especially novel ones, is a challenging problem using either vision or tactile sensing. In this work, we propose a novel framework to estimate key object parameters through non-prehensile manipulation using vision and tactile sensing. Our proposed active dual differentiable filtering (ADDF) approach, as part of our framework, learns the object-robot interaction during a non-prehensile object push to infer the object's parameters. The proposed method enables the robotic system to employ vision and tactile information to interactively explore a novel object via non-prehensile pushes. The novel N-step active formulation within the differentiable filtering facilitates efficient learning of the object-robot interaction model and, during inference, the selection of the next best exploratory push actions (where to push, and how to push). We extensively evaluated our framework in simulation and real-robotic scenarios, yielding superior performance to the state-of-the-art baseline.
|
| |
| 14:30-14:36, Paper MoBT12.6 | Add to My Program |
| IOSG: Image-Driven Object Searching and Grasping |
|
| Yu, Houjian | University of Minnesota, Twin Cities |
| Lou, Xibai | University of Minnesota Twin Cities |
| Yang, Yang | University of Minnesota |
| Choi, Changhyun | University of Minnesota, Twin Cities |
Keywords: Perception-Action Coupling, Perception for Grasping and Manipulation, Deep Learning in Grasping and Manipulation
Abstract: When robots retrieve specific objects from cluttered scenes, such as home and warehouse environments, the target objects are often partially occluded or completely hidden. Robots are thus required to search, identify a target object, and successfully grasp it. Preceding works have relied on pre-trained object recognition or segmentation models to find the target object. However, such methods require laborious manual annotations to train the models and even fail to find novel target objects. In this paper, we propose an Image-driven Object Searching and Grasping (IOSG) approach where a robot is provided with the reference image of a novel target object and tasked to find and retrieve it. We design a Target Similarity Network that generates a probability map to infer the location of the novel target. IOSG learns a hierarchical policy; the high-level policy predicts the subtask type, whereas the low-level policies, explorer and coordinator, generate effective push and grasp actions. The explorer is responsible for searching the target object when it is hidden or occluded by other objects. Once the target object is found, the coordinator conducts target-oriented pushing and grasping to retrieve the target from the clutter. The proposed pipeline is trained with full self-supervision in simulation and applied to a real environment. Our model achieves a 96.0% and 94.5% task success rate on coordination and exploration tasks in simulation respectively, and 85.0% success rate on a real robot for the search-and-grasp task. Please refer to our project page for more information: https://z.umn.edu/iosg.
|
| |
| 14:36-14:42, Paper MoBT12.7 | Add to My Program |
| DexRepNet: Learning Dexterous Robotic Grasping Network with Geometric and Spatial Hand-Object Representation |
|
| Qingtao, Liu | Zhejiang University |
| Cui, Yu | Zhejiang University |
| Ye, Qi | Zhejiang University |
| Sun, Zhengnan | Zhejiang University |
| Li, Haoming | Zhejiang University |
| Li, Gaofeng | Zhejiang University |
| Shao, Lin | National University of Singapore |
| Chen, Jiming | Zhejiang University |
Keywords: Perception for Grasping and Manipulation, Grasping, Multifingered Hands
Abstract: Robotic dexterous grasping is a challenging problem due to the high degree of freedom (DoF) and complex contacts of multi-fingered robotic hands. Existing deep reinforcement learning (DRL) based methods leverage human demonstrations to reduce sample complexity due to the high-dimensional action space of dexterous grasping. However, less attention has been paid to hand-object interaction representations for high-level generalization. In this paper, we propose a novel geometric and spatial hand-object interaction representation, named DexRep, to capture object surface features and the spatial relations between hands and objects during grasping. DexRep comprises an Occupancy Feature for rough shapes within the sensing range of moving hands, a Surface Feature for changing hand-object surface distances, and a Local-Geo Feature for the local geometric surface features most related to potential contacts. Based on the new representation, we propose a dexterous deep reinforcement learning method, DexRepNet, to learn a generalizable grasping policy. Experimental results show that our method dramatically outperforms baselines using existing representations for robotic grasping, in both grasp success rate and convergence speed. It achieves a 93% grasping success rate on seen objects and higher than 80% grasping success rates on diverse objects of unseen categories in both simulation and real-world experiments.
|
| |
| 14:42-14:48, Paper MoBT12.8 | Add to My Program |
| Active Acoustic Sensing for Robot Manipulation |
|
| Lu, Shihan | University of Southern California |
| Culbertson, Heather | University of Southern California |
Keywords: Perception for Grasping and Manipulation, Force and Tactile Sensing, Grasping
Abstract: Perception in robot manipulation has been actively explored with the goal of advancing and integrating vision and touch for global and local feature extraction. However, it is difficult to perceive certain object internal states, and the integration of visual and haptic perception is not compact and is easily biased. We propose to address these limitations by developing an active acoustic sensing method for robot manipulation. Active acoustic sensing relies on the resonant properties of the object, which are related to its material, shape, internal structure, and contact interactions with the gripper and environment. The sensor consists of a vibration actuator paired with a piezo-electric microphone. The actuator generates a waveform, and the microphone tracks the waveform's propagation and distortion as it travels through the object. This paper presents the sensing principles, hardware design, simulation development, and evaluation of physical and simulated sensory data under different conditions as a proof-of-concept. This work aims to provide fundamentals on a useful tool for downstream robot manipulation tasks using active acoustic sensing, such as object recognition, grasping point estimation, object pose estimation, and external contact formation detection.
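A minimal sketch of the sensing principle (signal parameters are assumed for illustration; the actual hardware pairs a vibration actuator with a piezoelectric microphone): emit a known sweep waveform, then estimate the propagation delay of the received signal by cross-correlation.

```python
import numpy as np

fs = 44100                         # sample rate in Hz (assumed)
t = np.arange(0, 0.1, 1 / fs)      # 100 ms excitation
f0, f1 = 200.0, 2000.0             # linear sweep from 200 Hz to 2 kHz
sweep = np.sin(2 * np.pi * (f0 * t + (f1 - f0) / (2 * t[-1]) * t ** 2))

# Simulate propagation through the object as a pure 50-sample delay;
# a real object would also distort the waveform in a material-dependent way.
delay_samples = 50
received = np.concatenate([np.zeros(delay_samples), sweep])

# Cross-correlate the received signal against the emitted sweep;
# the peak location recovers the propagation delay.
corr = np.correlate(received, sweep, mode="valid")
print(int(np.argmax(corr)))
```

In practice the interesting signal is the distortion, not just the delay: resonant properties shift energy between frequencies depending on material, internal structure, and contact state.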
|
| |
| 14:48-14:54, Paper MoBT12.9 | Add to My Program |
| Grasp Region Exploration for 7-DoF Robotic Grasping in Cluttered Scenes |
|
| Chen, Zibo | Sun Yat-Sen University |
| Liu, Zhixuan | Sun Yat-Sen University |
| Xie, Shangjin | Sun Yat-Sen University |
| Zheng, Wei-Shi | Sun Yat-Sen University |
Keywords: Perception for Grasping and Manipulation
Abstract: Robotic grasping is a fundamental skill for robots, but it is quite challenging in cluttered scenes. In cluttered scenes, precise prediction of high-quality grasp configurations such as rotation and grasping width, while avoiding collisions, is essential. To accomplish this, grasp detection models require stronger capabilities for extracting fine-grained information around the grasp points. However, due to computational resource restrictions, point clouds are usually downsampled in existing networks, which inevitably discards some potentially important points. To overcome this problem, we propose a Grasp Region Exploration module to explore the area covered by high-quality grasps. Based on the grasp region, we enhance the point density around the grasp points to mitigate the loss of information caused by downsampling. Furthermore, we devise a Grasp Region Attention module to dynamically aggregate features of various points within the grasp region, such as the grasp point and contact points. The proposed method achieves state-of-the-art performance on the large-scale GraspNet-1Billion dataset. We also conduct real-world experiments on a Franka Emika Panda robot and show that the robot can grasp objects in cluttered scenes with a high success rate.
|
| |
| 14:54-15:00, Paper MoBT12.10 | Add to My Program |
| Bagging by Learning to Singulate Layers Using Interactive Perception |
|
| Chen, Lawrence Yunliang | UC Berkeley |
| Shi, Baiyu | UC Berkeley |
| Lin, Roy | University of California, Berkeley |
| Seita, Daniel | Carnegie Mellon University |
| Ahmad, Ayah | University of California, Berkeley |
| Cheng, Richard | California Institute of Technology |
| Kollar, Thomas | Toyota Research Institute |
| Held, David | Carnegie Mellon University |
| Goldberg, Ken | UC Berkeley |
Keywords: Perception for Grasping and Manipulation, Bimanual Manipulation, Deep Learning in Grasping and Manipulation
Abstract: Many fabric handling and 2D deformable material tasks in homes and industry require singulating layers of material such as opening a bag or arranging garments for sewing. In contrast to methods requiring specialized sensing or end effectors, we use only visual observations with ordinary parallel jaw grippers. We propose SLIP: Singulating Layers using Interactive Perception, and apply SLIP to the task of autonomous bagging. We develop SLIP-Bagging, a bagging algorithm that manipulates a plastic or fabric bag from an unstructured state, and uses SLIP to grasp the top layer of the bag to open it for object insertion. In physical experiments, a YuMi robot achieves a success rate of 67% to 81% across bags of a variety of materials, shapes, and sizes, significantly improving in success rate and generality over prior work. Experiments also suggest that SLIP can be applied to tasks such as singulating layers of folded cloth and garments. Supplementary material is available at https://sites.google.com/view/slip-bagging/.
|
| |
| 15:00-15:06, Paper MoBT12.11 | Add to My Program |
| Simultaneous Multi-Object 3D Shape Reconstruction, 6DoF Pose Estimation and Dense Grasp Prediction |
|
| Agrawal, Shubham | Samsung Research America |
| Chavan-Dafle, Nikhil | Samsung Research America |
| Kasahara, Isaac | Samsung Research America |
| Engin, Kazim Selim | University of Minnesota |
| Huh, Jinwook | Samsung |
| Isler, Volkan | University of Minnesota |
Keywords: Perception for Grasping and Manipulation, Deep Learning in Grasping and Manipulation, Grasping
Abstract: In this paper, we present a real-time method for simultaneous object-level scene understanding and grasp prediction. Specifically, given a single RGBD image of a scene, our method localizes all the objects in the scene and, for each object, generates the following: full 3D shape, scale, pose with respect to the camera frame, and a dense set of feasible grasps. The main advantage of our method is its computation speed, as it avoids sequential perception and grasp planning. With detailed quantitative analysis of reconstruction quality and grasp accuracy, we show that our method delivers competitive performance compared to state-of-the-art methods, while providing fast inference at 30 frames per second.
|
| |
| 15:06-15:12, Paper MoBT12.12 | Add to My Program |
| Flexible Handover with Real-Time Robust Dynamic Grasp Trajectory Generation |
|
| Zhang, Gu | Shanghai Jiaotong University |
| Fang, Hao-Shu | Shanghai Jiao Tong University |
| Fang, Hongjie | Shanghai Jiao Tong University |
| Lu, Cewu | ShangHai Jiao Tong University |
Keywords: Perception for Grasping and Manipulation, Human-Robot Collaboration, Grasping
Abstract: In recent years, there has been a significant effort dedicated to developing efficient, robust, and general human-to-robot handover systems. However, the area of flexible handover in the context of complex and continuous object motion remains relatively unexplored. In this work, we propose an approach for effective and robust flexible handover, which enables the robot to grasp objects moving along flexible trajectories with a high success rate. The key innovation of our approach is the generation of real-time robust grasp trajectories. We also design a future grasp prediction algorithm to enhance the system's adaptability to dynamic handover scenes. We conduct one-motion handover experiments and motion-continuous handover experiments on our novel benchmark that includes 31 diverse household objects. The system we have developed allows users to move and rotate objects in their hands within a relatively large range. The success rate of the robot grasping such moving objects is 78.15% over the entire household object benchmark.
|
| |
| MoBT13 Regular session, 260 Portside Ballroom |
Add to My Program |
| Computer Vision for Automation |
|
| |
| Chair: Wang, Chen | State University of New York at Buffalo |
| Co-Chair: Kantor, George | Carnegie Mellon University |
| |
| 14:00-14:06, Paper MoBT13.1 | Add to My Program |
| NeurAR: Neural Uncertainty for Autonomous 3D Reconstruction with Implicit Neural Representations |
|
| Ran, Yunlong | Zhejiang University |
| Zeng, Jing | Zhejiang University |
| He, Shibo | Zhejiang University |
| Chen, Jiming | Zhejiang University |
| Li, Lincheng | NetEase Fuxi AI Lab |
| Chen, Yingfeng | Netease Inc |
| Lee, Gim Hee | National University of Singapore |
| Ye, Qi | Zhejiang University |
Keywords: Computer Vision for Automation, Motion and Path Planning, Planning under Uncertainty
Abstract: Implicit neural representations have shown compelling results in offline 3D reconstruction and have also recently demonstrated potential for online SLAM systems. However, applying them to autonomous 3D reconstruction, where a robot is required to explore a scene and plan a view path for the reconstruction, has not been studied. In this paper, we explore for the first time the possibility of using implicit neural representations for autonomous 3D scene reconstruction by addressing two key challenges: 1) seeking a criterion to measure the quality of candidate viewpoints for view planning based on the new representations, and 2) learning the criterion from data so that it generalizes to different scenes, instead of hand-crafting one. To solve these challenges, firstly, a proxy of Peak Signal-to-Noise Ratio (PSNR) is proposed to quantify viewpoint quality; secondly, the proxy is optimized jointly with the parameters of an implicit neural network for the scene. With the proposed view quality criterion from neural networks (termed Neural Uncertainty), we can then apply implicit representations to autonomous 3D reconstruction. Our method demonstrates significant improvements on various metrics for the rendered image quality and the geometry quality of the reconstructed 3D models when compared with variants using TSDF or reconstruction without view planning.
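For reference, the standard PSNR that the proposed proxy approximates is computed from the mean squared error between a rendered and a ground-truth image (the toy values below are illustrative, not from the paper):

```python
import numpy as np

def psnr(rendered, reference, max_val=1.0):
    """Peak Signal-to-Noise Ratio in dB for images with values in [0, max_val]."""
    mse = np.mean((rendered - reference) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.ones((4, 4))
noisy = ref - 0.1          # uniform error of 0.1 -> MSE = 0.01
print(psnr(noisy, ref))    # 20.0 dB
```

Since true PSNR requires a ground-truth image that is unavailable for unvisited viewpoints, the paper learns a proxy of this quantity jointly with the scene representation.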
|
| |
| 14:06-14:12, Paper MoBT13.2 | Add to My Program |
| HyperTraj: Towards Simple and Fast Scene-Compliant Endpoint Conditioned Trajectory Prediction |
|
| Huang, Renhao | University of New South Wales |
| Pagnucco, Maurice | University of New South Wales |
| Song, Yang | University of New South Wales |
Keywords: Computer Vision for Automation, Vision-Based Navigation, Intention Recognition
Abstract: An important task in trajectory prediction is to model the uncertainty of agents' motions, which requires the system to propose multiple plausible future trajectories for agents based on their past movements. Recently, many approaches have been developed following an endpoint-conditioned deep learning framework: first predicting the distribution of endpoints, then sampling endpoints from it, and finally completing their waypoints. However, this framework suffers from a severe efficiency issue, as it needs to repeatedly execute a separate decoder conditioned on multiple sampled endpoints. In this work, we propose a simple and fast endpoint-conditioned fully convolutional trajectory prediction framework, called HyperTraj, that uses dynamic convolutions to generate multiple trajectories. Its main benefits are that (1) our prediction is conditioned on endpoints but takes almost constant time as the number of goals increases, and (2) our model benefits from convolution-based predictions, such as the acceptance of various scene sizes and better modeling of agent-scene interactions. In our experiments, our model shows comparable or even better accuracy than state-of-the-art baselines on the SDD and VIRAT datasets, with around 84% acceleration and 90% model weight reduction for waypoint decoding.
|
| |
| 14:12-14:18, Paper MoBT13.3 | Add to My Program |
| PanelPose: A 6D Pose Estimation of Highly-Variable Panel Object for Robotic Robust Cockpit Panel Inspection |
|
| Sun, Han | Shanghai Jiao Tong UNIVERSITY |
| Ni, Peiyuan | National University of Singapore |
| Li, Zhiqi | Shanghai Jiao Tong UNIVERSITY |
| Wang, Yizhao | SJTU |
| Zhu, Xiaoxiao | SJTU |
| Cao, Qixin | Shanghai Jiao Tong University |
Keywords: Computer Vision for Automation, Industrial Robots, Recognition
Abstract: In robotic cockpit inspection scenarios, the 6D pose of highly-variable panel objects is necessary. However, the buttons with different states on the panel cause variable texture and point clouds, which confuse traditional pose estimation methods designed for invariable objects. To address this issue, we propose a simple yet effective method, denoted PanelPose, that leverages synthetic data and edge-line features. Specifically, we extract edge and line features of RGB images and fuse these feature maps into a multi-feature fusion map (MFF Map) to focus on the shape features of panel objects. Moreover, we design an effective keypoint selection algorithm considering the shape information of panel objects, which simplifies keypoint localization for precise pose estimation. Finally, the panel object pose is estimated via PnP/RANSAC, refined by the multi-state template (MST) and multi-scale ICP. We experimentally show that state-of-the-art 6D pose estimation methods alone are not sufficient to solve the cockpit panel inspection task, but that our method significantly improves performance. In cockpit inspection scenarios, the panel localization error is less than 3 mm using our method. Code and data are available at https://github.com/sunhan1997/PanelPose.
|
| |
| 14:18-14:24, Paper MoBT13.4 | Add to My Program |
| Image Restoration Via UAVFormer for Under-Display Camera of UAV |
|
| Zheng, Zhuoran | Nanjing University of Science and Technology |
| Jia, Xiuyi | Nanjing University of Science and Technology |
Keywords: Computer Vision for Automation, Computer Vision for Manufacturing, Computer Vision for Transportation
Abstract: The exposed cameras of UAVs can shake, shift, or even malfunction under the influence of harsh weather, while add-on devices (Dupont lines) are very vulnerable to damage. Although we can place a low-cost transparent film overlay around the camera to protect it, this also introduces image degradation issues (such as oversaturation and astigmatism). To tackle the image degradation problem caused by the transparent film overlay, in this paper we propose a novel method to enhance the visual experience by adapting a deep network to UAV characteristics. Specifically, we first develop a stabilizer to filter the input images, which avoids blurred imaging due to the shaking of the drone hardware. Then, we propose a customized Transformer named UAVFormer to recover the image, which has a key module at each stage based on the Swin Transformer with local awareness (LAT). Finally, we use an evidential fusion algorithm to integrate the generated images at each stage to obtain a high-quality result. Furthermore, we create a high-resolution under-display camera dataset to support the training and testing of compared models. Our model can conduct high-quality recovery of 2K-resolution images on some embedded devices (Raspberry Pi 4b) in real time.
|
| |
| 14:24-14:30, Paper MoBT13.5 | Add to My Program |
| Semantic Scene Difference Detection in Daily Life Patroling by Mobile Robots Using Pre-Trained Large-Scale Vision-Language Model |
|
| Obinata, Yoshiki | The University of Tokyo |
| Kawaharazuka, Kento | The University of Tokyo |
| Kanazawa, Naoaki | The University of Tokyo |
| Yamaguchi, Naoya | The University of Tokyo |
| Tsukamoto, Naoto | The University of Tokyo |
| Yanokura, Iori | University of Tokyo |
| Kitagawa, Shingo | The University of Tokyo |
| Shinjo, Koki | The University of Tokyo |
| Okada, Kei | The University of Tokyo |
| Inaba, Masayuki | The University of Tokyo |
Keywords: Environment Monitoring and Management, Computer Vision for Automation, Recognition
Abstract: It is important for daily life support robots to detect changes in their environment and perform tasks. In the field of anomaly detection in computer vision, probabilistic and deep learning methods have been used to calculate image distances, focusing on image pixels. In contrast, this study aims to detect semantic changes in the daily life environment using recent large-scale vision-language models. Using a Visual Question Answering (VQA) model, we propose a method to detect semantic changes by applying multiple questions to a reference image and a current image and obtaining answers in the form of sentences. Unlike deep learning-based methods in anomaly detection, this method does not require any training or fine-tuning, is not affected by noise, and is sensitive to semantic state changes in the real world. In our experiments, we demonstrated the effectiveness of this method by applying it to a patrol task in a real-life environment using a mobile robot, Fetch Mobile Manipulator. In the future, it may be possible to add explanatory power to changes in the daily life environment through spoken language.
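The detection logic described above — asking the same questions about a reference and a current image and flagging differing answers — can be sketched as follows; `answer_fn` is a placeholder for any pre-trained VQA model, not a specific API:

```python
def detect_semantic_changes(questions, answer_fn, ref_image, cur_image):
    """Flag a semantic change whenever the VQA answers for the
    reference and current images disagree on the same question.
    Returns a list of (question, reference_answer, current_answer)."""
    changes = []
    for q in questions:
        a_ref = answer_fn(q, ref_image)
        a_cur = answer_fn(q, cur_image)
        # compare answers case-insensitively to ignore trivial variation
        if a_ref.strip().lower() != a_cur.strip().lower():
            changes.append((q, a_ref, a_cur))
    return changes
```

Because the comparison is on answer sentences rather than pixels, the scheme is insensitive to lighting noise but sensitive to semantic state changes (e.g. a door being open versus closed).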
|
| |
| 14:30-14:36, Paper MoBT13.6 | Add to My Program |
| Seeing the Fruit for the Leaves: Robotically Mapping Apple Fruitlets in a Commercial Orchard |
|
| Qureshi, Ans | University of Auckland |
| Smith, David | University of Auckland |
| Gee, Trevor | The University of Auckland |
| Nejati, Mahla | The University of Auckland |
| Shahabi, Jalil | University of Auckland |
| Lim, JongYoon | University of Auckland |
| Ahn, Ho Seok | The University of Auckland, Auckland |
| McGuinness, Benjamin John | University of Waikato |
| Downes, Catherine | University of Waikato |
| Jangali, Rahul | The University of Waikato |
| Black, Kale | Black Box Technologies LTD |
| Lim, Shen Hin | University of Waikato |
| Duke, Mike | Waikato University |
| MacDonald, Bruce | University of Auckland |
| Williams, Henry | University of Auckland |
Keywords: Robotics and Automation in Agriculture and Forestry, Computer Vision for Automation, Agricultural Automation
Abstract: Aotearoa New Zealand has a strong and growing apple industry but struggles to access workers to complete skilled, seasonal tasks such as thinning. To ensure effective thinning and make informed decisions on a per-tree basis, it is crucial to accurately measure the crop load of individual apple trees. However, this task poses challenges due to the dense foliage that hides the fruitlets within the tree structure. In this paper, we introduce the vision system of an automated apple fruitlet thinning robot, developed to tackle the labor shortage issue. This paper presents the initial design, implementation, and evaluation specifics of the system. The platform straddles the 3.4 m tall 2D apple canopy structures to create an accurate map of the fruitlets on each tree. We show that this platform can measure the fruitlet load on an apple tree by scanning through both sides of the branch. The requirement of an overarching platform was justified since two-sided scans had a higher counting accuracy of 81.17% than one-sided scans at 73.7%. The system was also demonstrated to produce size estimates within 5.9% RMSE of their true size.
|
| |
| 14:36-14:42, Paper MoBT13.7 | Add to My Program |
| Cross-Domain Autonomous Driving Perception Using Contrastive Appearance Adaptation |
|
| Zheng, Ziqiang | Hong Kong University of Science and Technology |
| Chen, Yingshu | HKUST |
| Hua, Binh-Son | VinAI |
| Wu, Yang | Tencent |
| Yeung, Sai-Kit | Hong Kong University of Science and Technology |
Keywords: Computer Vision for Automation, Object Detection, Segmentation and Categorization, Autonomous Vehicle Navigation
Abstract: Addressing domain shifts for complex perception tasks in autonomous driving has long been a challenging problem. In this paper, we show that existing domain adaptation methods pay little attention to the content mismatch issue between source and target domains, which weakens domain adaptation performance and the decoupling of domain-invariant and domain-specific representations. To solve these problems, we propose an image-level domain adaptation framework that adapts source-domain images to the target domain using content-aligned source-target image pairs. Our framework consists of three mutually beneficial modules in a cycle: a cross-domain content alignment module that generates source-target pairs with consistent content representations in a self-supervised manner, a reference-guided image synthesis module based on the generated content-aligned source-target image pairs, and a contrastive learning module that self-supervises a domain-invariant feature extractor. Our contrastive appearance adaptation is task-agnostic and robust to complex perception tasks in autonomous driving. Our proposed method demonstrates state-of-the-art results in cross-domain object detection, semantic segmentation, and depth estimation, as well as better image synthesis ability both qualitatively and quantitatively.
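The contrastive module pairs content-aligned images as positives and other images as negatives. The paper's exact loss is not given here; a generic InfoNCE formulation, a standard choice for such modules, can be sketched in NumPy as an illustration:

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss: row i of `positives` is the positive for
    row i of `anchors`; all other rows act as in-batch negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    return float(-np.log(np.diag(probs)).mean())   # -log p(correct pair)
```

Minimizing this loss pulls each anchor toward its content-aligned counterpart and pushes it away from the other samples, which is what drives the feature extractor toward domain invariance.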
|
| |
| 14:42-14:48, Paper MoBT13.8 | Add to My Program |
| MENTOR: Multilingual tExt detectioN TOward leaRning by Analogy |
|
| Lin, Hsin-Ju | National Yang Ming Chiao Tung University |
| Chung, Tsu-Chun | National Yang Ming Chiao Tung University |
| Hsiao, Ching-chun | National Yang Ming Chiao Tung University |
| Chen, Pin-Yu | IBM Research |
| Chiu, Wei-Chen | National Chiao Tung University |
| Huang, Ching-Chun | National Chiao Tung University |
Keywords: Computer Vision for Automation, Recognition, Semantic Scene Understanding
Abstract: Text detection is frequently used in vision-based mobile robots when they need to interpret texts in their surroundings to perform a given task. For instance, delivery robots in multilingual cities need to be capable of doing multilingual text detection so that the robots can read traffic signs and road markings. Moreover, the target languages change from region to region, implying the need of efficiently re-training the models to recognize the novel/new languages. However, collecting and labeling training data for novel languages are cumbersome, and the efforts to re-train an existing/trained text detector are considerable. Even worse, such a routine would repeat whenever a novel language appears. This motivates us to propose a new problem setting for tackling the aforementioned challenges in a more efficient way: ``We ask for a generalizable multilingual text detection framework to detect and identify seen and unseen language regions inside scene images without the requirement of collecting supervised training data for unseen languages as well as model re-training''. To this end, we propose ``MENTOR'', the first work to realize a learning strategy between zero-shot learning and few-shot learning for multilingual scene text detection. During the training phase, we leverage the ``zero-cost'' synthesized printed texts and the available training/seen languages to learn the meta-mapping from printed texts to language-specific kernel weights. Meanwhile, dynamic convolution networks guided by the language-specific kernel are trained to realize a detection-by-feature-matching scheme. In the inference phase, ``zero-cost'' printed texts are synthesized given a new target language. By utilizing the learned meta-mapping and the matching network, our ``MENTOR'' can freely identify the text regions of the new language. 
Experiments show our model can achieve comparable results with supervised methods for seen languages and outperform other methods in detecting unseen languages.
|
| |
| 14:48-14:54, Paper MoBT13.9 | Add to My Program |
| Towards a Robust Adversarial Patch Attack against Unmanned Aerial Vehicles Object Detection |
|
| Shrestha, Samridha | Technology Innovation Institute |
| Pathak, Saurabh | Technology Innovation Institute |
| Viegas, Eduardo | Pontifícia Universidade Católica do Paraná (PUCPR), Brazil |
Keywords: Computer Vision for Automation, Deep Learning Methods
Abstract: Object detection techniques for autonomous Unmanned Aerial Vehicles (UAV) are built upon Deep Neural Networks (DNN), which are known to be vulnerable to adversarial patch perturbation attacks that lead to object detection evasion. Yet, current adversarial patch generation schemes are not designed for UAV imagery settings. This paper proposes a new robust adversarial patch generation attack against object detection with UAVs. We build adversarial patches considering UAV-specific settings such as the UAV camera perspective, viewing angle, distance, and brightness changes. As a result, built patches can also degrade the accuracy of object detector models implemented with different initializations and architectures. Experiments conducted on the VisDrone dataset have shown the proposal's feasibility, achieving an attack success rate of up to 80% in a white-box setting. In addition, we also transfer the patch against DNN models with different initializations and different architectures, reaching attack success rates of up to 75% and 78%, respectively, in a gray-box setting.
|
| |
| 14:54-15:00, Paper MoBT13.10 | Add to My Program |
| Fast Point to Mesh Distance by Domain Voxelization |
|
| Gutow, Geordan | Carnegie Mellon University |
| Choset, Howie | Carnegie Mellon University |
Keywords: Computational Geometry, RGB-D Perception, Computer Vision for Automation
Abstract: Computing the distance from a point to a triangle mesh is a key computational step in robotics pipelines such as registration and collision detection, with applications to path planning, SLAM, and RGB-D vision. Numerous techniques to accelerate this computation have been developed, many of which use a cheap pre-processing step to construct a hierarchical decomposition of the mesh. If the mesh is fixed and known ahead of time, there is an opportunity to conduct more expensive pre-computations to accelerate the subsequent distance queries. This work presents a voxelization approach, implemented on both CPU and GPU, to compute point to mesh distance that constructs for each voxel a near-minimal set of triangles that is guaranteed to include every triangle that is closest to at least one point in the voxel. Theoretical and numerical comparisons with six alternative distance algorithms demonstrate the speed advantages of the proposed method.
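The per-voxel candidate-set idea can be illustrated in miniature. The sketch below uses point sites instead of triangles (point-to-triangle distance is omitted for brevity, and all names are illustrative, not the paper's implementation): a site survives a voxel's candidate list only if its lower-bound distance to the voxel can beat the best upper bound, which guarantees the true nearest site is never pruned.

```python
import math

def build_voxel_candidates(sites, lo, hi, n):
    """Partition the box [lo, hi] into n^3 voxels. For each voxel keep
    only sites that could be nearest to some query inside it: site m
    survives if dist(center, m) - half_diag <= min_j dist(center, j)
    + half_diag, a conservative bound that never drops the answer."""
    step = [(hi[d] - lo[d]) / n for d in range(3)]
    half_diag = 0.5 * math.sqrt(sum(s * s for s in step))
    table = {}
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c = (lo[0] + (i + 0.5) * step[0],
                     lo[1] + (j + 0.5) * step[1],
                     lo[2] + (k + 0.5) * step[2])
                d = [math.dist(c, s) for s in sites]
                best_upper = min(d) + half_diag
                table[(i, j, k)] = [m for m, dm in enumerate(d)
                                    if dm - half_diag <= best_upper]
    return table, step

def nearest_site_distance(q, sites, table, lo, step, n):
    """Distance from q to its nearest site, scanning only the
    candidate list of q's voxel instead of every site."""
    idx = tuple(min(n - 1, max(0, int((q[d] - lo[d]) / step[d])))
                for d in range(3))
    return min(math.dist(q, sites[m]) for m in table[idx])
```

The expensive part (building the table) runs once per fixed mesh; each subsequent query only touches its voxel's near-minimal candidate list, which mirrors the precompute/query split the paper exploits.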
|
| |
| 15:00-15:06, Paper MoBT13.11 | Add to My Program |
| AirLine: Efficient Learnable Line Detection with Local Edge Voting |
|
| Lin, Xiao | Georgia Institute of Technology |
| Wang, Chen | State University of New York at Buffalo |
Keywords: Computer Vision for Automation, SLAM
Abstract: Line detection is widely used in many robotic tasks such as scene recognition, 3D reconstruction, and simultaneous localization and mapping (SLAM). Compared to points, lines can provide both low-level and high-level geometrical information for downstream tasks. In this paper, we propose a novel learnable edge-based line detection algorithm, AirLine, which can be applied to various tasks. In contrast to existing learnable endpoint-based methods, which are sensitive to the geometrical condition of environments, AirLine can extract line segments directly from edges, resulting in a better generalization ability for unseen environments. To balance efficiency and accuracy, we introduce a region-growing algorithm and a local edge voting scheme for line parameterization. To the best of our knowledge, AirLine is one of the first learnable edge-based line detection methods. Our extensive experiments have shown that it retains state-of-the-art-level precision, yet with a 3-80x runtime acceleration compared to other learning-based methods, which is critical for low-power robots.
|
| |
| 15:06-15:12, Paper MoBT13.12 | Add to My Program |
| 3D Skeletonization of Complex Grapevines for Robotic Pruning |
|
| Schneider, Franz | Carnegie Mellon University |
| Jayanth, Sushanth | Carnegie Mellon University |
| Silwal, Abhisesh | Carnegie Mellon University |
| Kantor, George | Carnegie Mellon University |
Keywords: RGB-D Perception, Robotics and Automation in Agriculture and Forestry, Computer Vision for Automation
Abstract: Robotic pruning of dormant grapevines is an area of active research in order to promote vine balance and grape quality, but so far robotic efforts have largely focused on planar, simplified vines not representative of commercial vineyards. This paper aims to advance the robotic perception capabilities necessary for pruning in denser and more complex vine structures by extending plant skeletonization techniques. The proposed pipeline generates skeletal grapevine models that have lower reprojection error and higher connectivity than baseline algorithms. We also show how 3D and skeletal information enables prediction accuracy of pruning weight for dense vines surpassing prior work, where pruning weight is an important vine metric influencing pruning site selection.
|
| |
| 15:12-15:18, Paper MoBT13.13 | Add to My Program |
| AdaptSeqVPR: An Adaptive Sequence-Based Visual Place Recognition Pipeline |
|
| Li, Heshan | Nanyang Technological University |
| Peng, Guohao | Nanyang Technological University |
| Zhang, Jun | Nanyang Technological University |
| Vaikundam, Sriram | Continental Automotive Singapore Pte Ltd |
| Wang, Danwei | Nanyang Technological University |
Keywords: Computer Vision for Automation
Abstract: Visual Place Recognition (VPR) is essential for autonomous robots and unmanned vehicles, as an accurate identification of visited sites can trigger loop closure to optimize the built map. The most prevalent methods tackle VPR as a single-frame retrieval task, using a CNN-based encoder to describe and compare each individual frame. These methods, however, overlook the temporal information between frames. Other methods improve on this by searching the database with consecutive frames, which can greatly reduce false positives. Nevertheless, current sequence-based methods typically assume the image frames to be captured at a constant speed, which is not always the case in practice. Therefore, we propose an adaptive sequence search strategy (AdaptSeq), which can dynamically alter the step size of adjacent frames in the retrieved sequence trajectory. In addition, to address invalid retrievals of input frames that have no true correspondence in the database, we propose a CNN-based discriminator named DDsNet. It can determine whether the top retrieved candidates are true positives based on learned statistics rather than an artificial threshold. Overall, we construct a novel sequence-based VPR pipeline named AdaptSeqVPR. It utilizes a CNN-based encoder for frame descriptions, and encompasses AdaptSeq and DDsNet for sequence matching. The experimental results indicate that our AdaptSeqVPR exhibits superior performance compared to the baseline SeqSLAM and SeqVLAD. Notably, our method can robustly handle sequence-based VPR for vehicles traveling at non-uniform speeds in changing environments.
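The adaptive step-size idea behind AdaptSeq can be sketched as a brute-force search over candidate strides (a simplified illustration, not the authors' implementation): trying several inter-frame step sizes lets the matcher absorb speed differences between the query and database traversals.

```python
import math

def best_sequence_match(query_seq, db_seq, steps=(1, 2, 3)):
    """Slide the query descriptor sequence over the database, trying
    several step sizes between consecutive database frames. Returns
    (start_index, step, score); score = mean Euclidean frame distance,
    lower is better."""
    n = len(query_seq)
    best = None
    for step in steps:
        span = (n - 1) * step              # database frames covered
        for start in range(len(db_seq) - span):
            score = math.fsum(
                math.dist(query_seq[i], db_seq[start + i * step])
                for i in range(n)) / n
            if best is None or score < best[2]:
                best = (start, step, score)
    return best
```

A query recorded at twice the database speed is correctly aligned with step 2, whereas a fixed-stride matcher would mislocate it.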
|
| |
| 15:18-15:24, Paper MoBT13.14 | Add to My Program |
| Towards Automated Void Detection for Search and Rescue with 3D Perception |
|
| Bal, Ananya | Carnegie Mellon University |
| Gupta, Ashutosh | BITS Pilani KK Birla Goa Campus |
| Goyal, Pranav | Birla Institute of Technology & Science - Pilan |
| Merrick, David | Florida State University |
| Murphy, Robin | Texas A&M |
| Choset, Howie | Carnegie Mellon University |
Keywords: Search and Rescue Robots, Aerial Systems: Perception and Autonomy, Computer Vision for Automation
Abstract: In a structural collapse, debris piles up in a chaotic and unstable manner, creating pockets and void spaces that are difficult to see or access. Often, these regions have the highest chances of concealing survivors and identifying such regions can increase the success of a search and rescue (SAR) operation while ensuring the safety of both survivors and rescue teams. In this paper, we present an approach for ex post facto void detection in rubble piles by using registered 3D point clouds reconstructed from aerial images captured at multiple times on the scene. We perform a temporal layering of these point clouds to capture the dynamic surface of the rubble pile from multiple days of the SAR operation and analyze this 3D structure to detect candidate regions corresponding to void spaces. The layering is achieved by a parallel 3D point cloud reconstruction of the scene using the COLMAP Structure from Motion pipeline. The void detection is achieved by applying multiple point filtering criteria in thin segments of the 3D point clouds of the rubble. We test our approach on aerial images collected from the Surfside Structural Collapse at Miami in June 2021. Our method achieves an improvement in registration compared to the use of standard point cloud registration methods on individual 3D reconstructions. Through our method, we see translation errors reduce by 82%. Additionally, our method detects 9 out of 10 void spaces that were observed by experts in the rubble.
|
| |
| MoBT14 Regular session, 320 |
Add to My Program |
| Localization II |
|
| |
| Chair: Barfoot, Timothy | University of Toronto |
| Co-Chair: Hausler, Stephen | CSIRO |
| |
| 14:00-14:06, Paper MoBT14.1 | Add to My Program |
| (LC)2: LiDAR-Camera Loop Constraints for Cross-Modal Place Recognition |
|
| Lee, Alex | Hyundai Motor Company |
| Song, Seungwon | Hyundai Motor Company |
| Lim, Hyungtae | Korea Advanced Institute of Science and Technology |
| Lee, Wooju | KAIST |
| Myung, Hyun | KAIST (Korea Advanced Institute of Science and Technology) |
Keywords: Localization, Sensor Fusion, Deep Learning for Visual Perception
Abstract: Localization has been a challenging task for autonomous navigation. A loop detection algorithm must overcome environmental changes for the place recognition and re-localization of robots. Therefore, deep learning has been extensively studied for the consistent transformation of measurements into localization descriptors. Street view images are easily accessible; however, images are vulnerable to appearance changes. LiDAR can robustly provide precise structural information. However, constructing a point cloud database is expensive, and point clouds exist only in limited places. Different from previous works that train networks to produce shared embedding directly between the 2D image and 3D point cloud, we transform both data into 2.5D depth images for matching. In this work, we propose a novel cross-matching method, called (LC)2, for achieving LiDAR localization without a prior point cloud map. To this end, LiDAR measurements are expressed in the form of range images before matching them to reduce the modality discrepancy. Subsequently, the network is trained to extract localization descriptors from disparity and range images. Next, the best matches are employed as a loop factor in a pose graph. Using public datasets that include multiple sessions in significantly different lighting conditions, we demonstrated that LiDAR-based navigation systems could be optimized from image databases and vice versa.
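The range-image form used above to reduce the modality gap is a standard spherical projection of the LiDAR scan. A minimal sketch (field-of-view and resolution values are illustrative, not the paper's):

```python
import math

def points_to_range_image(points, h=32, w=360, fov_up=15.0, fov_down=-15.0):
    """Project 3D LiDAR points (x, y, z) into an h x w range image:
    column from azimuth, row from elevation, value = range in meters
    (closest return wins); 0.0 marks empty cells."""
    img = [[0.0] * w for _ in range(h)]
    fov = fov_up - fov_down
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue
        yaw = math.atan2(y, x)                     # azimuth in [-pi, pi]
        pitch = math.degrees(math.asin(z / r))     # elevation in degrees
        u = int(0.5 * (yaw / math.pi + 1.0) * w) % w
        v = min(max(int((1.0 - (pitch - fov_down) / fov) * h), 0), h - 1)
        if img[v][u] == 0.0 or r < img[v][u]:
            img[v][u] = r
    return img
```

Once both modalities are expressed as 2.5D images (range images from LiDAR, disparity images from the camera), a shared descriptor network can be trained over a much smaller modality discrepancy.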
|
| |
| 14:06-14:12, Paper MoBT14.2 | Add to My Program |
| Visual Localization Based on Multiple Maps |
|
| Lin, Yukai | ETH Zurich |
| Liu, Liu | Huawei |
| Liang, Xiao | The University of Tokyo |
| Li, Jiangwei | Huawei Cloud Computing Technologies Co., Ltd |
Keywords: Localization, Vision-Based Navigation, SLAM
Abstract: This paper proposes a multi-map based visual localization method for image sequences. Given multiple single-map based localization results, we combine them with SLAM to estimate robust and accurate camera poses under challenging conditions. Our method comprises three modules connected in a sequence. First, we reconstruct multiple reference maps using the Structure from Motion technique, one map for each reference sequence. A single-image-based localization pipeline is performed to estimate 6-DoF camera poses for each query image, one for each map. Second, a consensus set maximization module is proposed to select the best camera poses from multi-map poses, estimating one 6-DoF camera pose for each query image. Finally, a robust pose refinement module is proposed to optimize 6-DoF camera poses of query images, combining map-based localization and local SLAM information. Experiments show that the proposed pipeline achieves state-of-the-art performance on challenging map-based localization benchmarks. Demonstrating the broad applicability of our method, we obtained the first place in the challenge of Map-Based Localization for Autonomous Driving at ECCV2022.
|
| |
| 14:12-14:18, Paper MoBT14.3 | Add to My Program |
| An Interacting Multiple Model Approach Based on Maximum Correntropy Student's T Filter |
|
| Candan, Fethi | The University of Sheffield |
| Beke, Aykut | Aselsan |
| Mihaylova, Lyudmila | University of Sheffield |
Keywords: Localization, Visual Tracking, Aerial Systems: Applications
Abstract: This paper presents a novel Interacting Multiple Model (IMM)-based maximum correntropy Student's T filter (MCStF). The MCStF can handle non-Gaussian measurement noise, and it is shown to outperform the IMM algorithm based on Kalman Filters (KFs) both in simulation and on a real-time system. The Crazyflie 2.0 nano Unmanned Aerial Vehicle (UAV) model is used for simulation validation, with results from 3000 independent Monte Carlo runs. After obtaining the simulation results under a monotonically varying non-Gaussian noise distribution, the performance of the two approaches is compared. The same scenario is then applied on the real-time system using the Crazyflie 2.0, and results from real-time tests are presented in which the position of the Crazyflie 2.0 is estimated online.
|
| |
| 14:18-14:24, Paper MoBT14.4 | Add to My Program |
| Deep Robust Multi-Robot Re-Localisation in Natural Environments |
|
| Ramezani, Milad | CSIRO |
| Griffiths, Ethan | Queensland University of Technology |
| Haghighat, Maryam | Queensland University of Technology |
| Pitt, Alex | CSIRO |
| Moghadam, Peyman | CSIRO |
Keywords: Localization, Deep Learning Methods, Recognition
Abstract: The success of re-localisation has crucial implications for the practical deployment of robots operating within a prior map or relative to one another in real-world scenarios. Using single-modality, place recognition and localisation can be compromised in challenging environments such as forests. To address this, we propose a strategy to prevent lidar-based re-localisation failure using lidar-image cross-modality. Our solution relies on self-supervised 2D-3D feature matching to predict alignment and misalignment. Leveraging a deep network for lidar feature extraction and relative pose estimation between point clouds, we train a model to evaluate the estimated transformation. A model predicting the presence of misalignment is learned by analysing image-lidar similarity in the embedding space and the geometric constraints available within the region seen in both modalities in Euclidean space. Experimental results using real datasets (offline and online modes) demonstrate the effectiveness of the proposed pipeline for robust re-localisation in unstructured, natural environments.
|
| |
| 14:24-14:30, Paper MoBT14.5 | Add to My Program |
| FVLoc-NeRF: Fast Vision-Only Localization within Neural Radiation Field |
|
| Guo, Wenzhi | Nanjing University |
| Haiyang, Bai | Nanjing University |
| Mou, Yuanqu | Nanjing University |
| Liu, Jia | Nanjing University |
| Chen, Lijun | Nanjing University |
Keywords: Localization, Deep Learning Methods, SLAM
Abstract: In recent years, Neural Radiance Fields (NeRF) have shown tremendous potential in encoding highly detailed 3D geometry and environmental appearance, making them a promising alternative to traditional explicit maps for robot localization. However, current NeRF localization methods suffer from significant computational overheads, primarily resulting from the large number of iterations or particle samples required, as well as the additional computational demands associated with estimating the initial pose through multimodal sensors. To overcome these challenges, we propose a novel and time-efficient NeRF localization pipeline, named FVLoc-NeRF. This pipeline solely employs RGB monocular images as input and leverages a retrieval method to obtain the initial pose. Subsequently, the pose update is derived using the Perspective-n-Point (PnP) algorithm, thereby considerably reducing the number of iterations and accelerating the localization process. Our extensive experimental results clearly demonstrate that FVLoc-NeRF is much faster than the state-of-the-art method.
|
| |
| 14:30-14:36, Paper MoBT14.6 | Add to My Program |
| RADA: Robust Adversarial Data Augmentation for Camera Localization in Challenging Conditions |
|
| Wang, Jialu | Oxford |
| Saputra, Muhamad Risqi U. | Monash University, Indonesia |
| Lu, Chris Xiaoxuan | University of Edinburgh |
| Trigoni, Niki | University of Oxford |
| Markham, Andrew | Oxford University |
Keywords: Localization, Computer Vision for Transportation
Abstract: Camera localization is a fundamental problem for many applications in computer vision, robotics, and autonomy. Despite recent deep learning-based approaches, the lack of robustness in challenging conditions persists due to changes in appearance caused by texture-less planes, repeating structures, reflective surfaces, motion blur, and illumination changes. Data augmentation is an attractive solution, but standard image perturbation methods fail to improve localization robustness. To address this, we propose RADA, which concentrates on perturbing the most vulnerable pixels to generate relatively small image perturbations that perplex the network. Our method outperforms previous augmentation techniques, achieving up to twice the accuracy of state-of-the-art models even under 'unseen' challenging weather conditions.
|
| |
| 14:36-14:42, Paper MoBT14.7 | Add to My Program |
| MagHT: A Magnetic Hough Transform for Fast Indoor Place Recognition |
|
| Abdul Raouf, Iad | CEA List |
| Gay-Bellile, Vincent | CEA LIST |
| Bourgeois, Steve | CEA LIST |
| Joly, Cyril | Mines ParisTech, PSL Research University |
| Paljic, Alexis | Mines ParisTech |
Keywords: Localization, Recognition, SLAM
Abstract: This article proposes a novel indoor magnetic field-based place recognition algorithm that is accurate and fast to compute. For that, we modified the generalized "Hough Transform" to process magnetic data (MagHT). It takes as input a sequence of magnetic measures whose relative positions are recovered by an odometry system and recognizes the places in the magnetic map where they were acquired. It also returns the global transformation from the coordinate frame of the input magnetic data to the magnetic map reference frame. Experimental results on several real datasets in large indoor environments demonstrate that the obtained localization error, recall, and precision are similar to or are better than state-of-the-art methods while improving the runtime by several orders of magnitude. Moreover, unlike magnetic sequence matching-based solutions such as DTW, our approach is independent of the path taken during the magnetic map creation.
|
| |
| 14:42-14:48, Paper MoBT14.8 | Add to My Program |
| What to Learn: Features, Image Transformations, or Both? |
|
| Chen, Yuxuan | University of Toronto |
| Xu, Binbin | University of Toronto |
| Dümbgen, Frederike | University of Toronto |
| Barfoot, Timothy | University of Toronto |
Keywords: Localization, Deep Learning for Visual Perception, Vision-Based Navigation
Abstract: Long-term visual localization is an essential problem in robotics and computer vision, but remains challenging due to the environmental appearance changes caused by lighting and seasons. While many existing works have attempted to solve it by directly learning invariant sparse keypoints and descriptors to match scenes, these approaches still struggle with adverse appearance changes. Recent developments in image transformations such as neural style transfer have emerged as an alternative to address such appearance gaps. In this work, we propose to combine an image transformation network and a feature-learning network to improve long-term localization performance. Given night-to-day image pairs, the image transformation network transforms the night images into day-like conditions prior to feature matching; the feature network learns to detect keypoint locations with their associated descriptor values, which can be passed to a classical pose estimator to compute the relative poses. We conducted various experiments to examine the effectiveness of combining style transfer and feature learning and its training strategy, showing that such a combination greatly improves long-term localization performance.
|
| |
| 14:48-14:54, Paper MoBT14.9 | Add to My Program |
| Global Localization: Utilizing Relative Spatio-Temporal Geometric Constraints from Adjacent and Distant Cameras |
|
| Altillawi, Mohammad | Huawei, Autonomous University of Barcelona |
| Pataki, Zador | ETH Zurich |
| Li, Shile | Algolux Germany |
| Liu, Ziyuan | Huawei Group |
Keywords: Localization, Vision-Based Navigation, Virtual Reality and Interfaces
Abstract: Re-localizing a camera from a single image in a previously mapped area is vital for many computer vision applications in robotics and augmented/virtual reality. In this work, we address the problem of estimating the 6 DoF camera pose relative to a global frame from a single image. We propose to leverage a novel network of relative spatial and temporal geometric constraints to guide the training of a Deep Network for Localization. We employ simultaneously spatial and temporal relative pose constraints that are obtained not only from adjacent camera frames but also from camera frames that are distant in the spatio-temporal space of the scene. We show that our method, through these constraints, is capable of learning to localize when little or very sparse ground-truth 3D coordinates are available. In our experiments, this is less than 1% of available ground-truth data. We evaluate our method on 3 common visual localization datasets and show that it outperforms other direct pose estimation methods.
|
| |
| 14:54-15:00, Paper MoBT14.10 | Add to My Program |
| Uncertainty-Aware Lidar Place Recognition in Novel Environments |
|
| Mason, Keita | CSIRO |
| Knights, Joshua Barton | Queensland University of Technology |
| Ramezani, Milad | CSIRO |
| Moghadam, Peyman | CSIRO |
| Miller, Dimity | Queensland University of Technology |
Keywords: Localization, Deep Learning for Visual Perception, Recognition
Abstract: State-of-the-art lidar place recognition models exhibit unreliable performance when tested on environments different from their training dataset, which limits their use in complex and evolving environments. To address this issue, we investigate the task of uncertainty-aware lidar place recognition, where each predicted place must have an associated uncertainty that can be used to identify and reject incorrect predictions. We introduce a novel evaluation protocol and present the first comprehensive benchmark for this task, testing across five uncertainty estimation techniques and three large-scale datasets. Our results show that an Ensembles approach is the highest performing technique, consistently improving the performance of lidar place recognition and uncertainty estimation in novel environments, though it incurs a computational cost. Code is publicly available at https://github.com/csiro-robotics/Uncertainty-LPR.
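The Ensembles technique the benchmark favors can be illustrated with a toy sketch (the data layout is assumed for illustration, not taken from the paper's code): each ensemble member scores candidate places, the mean score picks the match, and the members' disagreement on that score serves as the uncertainty used to reject unreliable predictions.

```python
import statistics

def ensemble_place_match(member_scores):
    """member_scores: one dict per ensemble member mapping
    place_id -> similarity. Returns (best_place, uncertainty), where
    uncertainty is the across-member std of the best place's score."""
    places = member_scores[0]
    means = {p: statistics.fmean(m[p] for m in member_scores)
             for p in places}
    best = max(means, key=means.get)                 # highest mean score
    uncertainty = statistics.pstdev([m[best] for m in member_scores])
    return best, uncertainty
```

In a novel environment, member models trained with different seeds tend to disagree, so thresholding this uncertainty lets the system reject predictions it would otherwise get wrong, at the cost of running every member per query.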
|
| |
| 15:00-15:06, Paper MoBT14.11 | Add to My Program |
| Hot-NetVLAD: Learning Discriminatory Key Points for Visual Place Recognition |
|
| Li, Zhikai | National University of Singapore |
| Lee, Christina Dao Wen | National University of Singapore |
| Tung, Beatrix | Singapore-MIT Alliance for Research and Technology |
| Huang, Zefan | National University of Singapore |
| Rus, Daniela | MIT |
| Ang Jr, Marcelo H | National University of Singapore |
Keywords: Localization, Vision-Based Navigation, Intelligent Transportation Systems
Abstract: Hot-NetVLAD implements a hot-spot detector on a learned local key-patch descriptor algorithm for Visual Place Recognition (VPR), thereby greatly cutting down the size of features extracted. The hot-spots pinpoint which regions are crucial for comparison when performing VPR. As hot-spots land on only a small portion of the feature space, the number of local descriptors extracted is greatly reduced. A novel method to extract ground truths of hot-spots in the context of VPR is proposed so that the hot-spot detector in Hot-NetVLAD can be trained for VPR purposes. Hot-NetVLAD is evaluated on the Pittsburgh250k and Tokyo24/7 datasets. While results show that Hot-NetVLAD trades some accuracy for storage efficiency, the recall remains competitive when compared to state-of-the-art methods. Furthermore, identified hot-spots bring new insights into the key regions required for VPR, as they tend to fall on distinguishable static objects in the scene. This can potentially be applied to increase the robustness of mobile robot localization by increasing resilience to dynamic environments, whilst still being able to perform static obstacle matching effectively.
|
| |
| 15:06-15:12, Paper MoBT14.12 | Add to My Program |
| Data-Driven Based Cascading Orientation and Translation Estimation for Inertial Navigation |
|
| Deng, Xiangyu | OPPO |
| Wang, Shenyue | OPPO |
| Shan, ChunXiang | OPPO |
| Lu, Jinjie | OPPO |
| Jin, Ke | OPPO |
| Li, Jijunnan | OPPO Research Institute |
| Guo, Yandong | OPPO Research Institute |
Keywords: Localization, AI-Based Methods
Abstract: Recently, data-driven approaches have brought both opportunities and challenges for Inertial Navigation Systems. In this paper, we propose a novel data-driven method composed of cascaded orientation and translation estimation using IMU-only measurements. For robust orientation estimation, we combine a CNN-based neural network with an EKF to eliminate orientation errors caused by sensor noise. We additionally propose a hybrid CNN-Transformer-based neural network which exploits both spatial and long-term temporal information to regress accurate translations. We conduct detailed evaluations on datasets acquired by iPhone and Android devices. The results demonstrate that our method outperforms state-of-the-art methods in both orientation and translation errors.
|
| |
| 15:12-15:18, Paper MoBT14.13 | Add to My Program |
| FE-Fusion-VPR: Attention-Based Multi-Scale Network Architecture for Visual Place Recognition by Fusing Frames and Events |
|
| Hou, Kuanxu | Northeastern University |
| Kong, Delei | Northeastern University (China) |
| Jiang, Junjie | Northeastern University |
| Zhuang, Hao | Northeastern University |
| Huang, Xinjie | Northeastern University, China |
| Fang, Zheng | Northeastern University |
Keywords: Localization, Recognition, Deep Learning Methods
Abstract: Traditional visual place recognition (VPR), usually using standard cameras, easily fails under glare or high-speed motion. By contrast, event cameras have the advantages of low latency, high temporal resolution, and high dynamic range, which can deal with the above issues. Nevertheless, event cameras are prone to failure in motionless scenes, while standard cameras can still provide appearance information in this case. Thus, exploiting the complementarity of standard cameras and event cameras can effectively improve the performance of VPR algorithms. In this paper, we propose FE-Fusion-VPR, an attention-based multi-scale network architecture for VPR by fusing frames and events. First, the intensity frame and event volume are fed into the two-stream feature extraction network for shallow feature fusion. Next, the three-scale features are obtained through the multi-scale fusion network and aggregated into three sub-descriptors using the VLAD layer. Finally, the weight of each sub-descriptor is learned through the descriptor re-weighting network to obtain the final refined descriptor. Experimental results show that our FE-Fusion-VPR outperforms existing frame-based, event-based and fusion-based VPR methods in most cases on the Brisbane-Event-VPR and DDD20 datasets. In short, compared to previous works, our FE-Fusion-VPR achieves new state-of-the-art (SOTA) VPR performance on the Brisbane-Event-VPR and DDD20 datasets by fusing frames and events.
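The VLAD aggregation step mentioned in the abstract (pooling local features into sub-descriptors) can be sketched with a NetVLAD-style soft assignment. In the paper the layer is learned end-to-end; here the centroids and the softness parameter `alpha` are illustrative placeholders.

```python
import numpy as np

def vlad_aggregate(features, centroids, alpha=10.0):
    """Soft-assignment VLAD descriptor (NetVLAD-style sketch).

    features:  (N, D) local features
    centroids: (K, D) cluster centers
    Returns an L2-normalized (K*D,) global descriptor.
    """
    # soft-assign each feature to each centroid (softmax over -alpha * distance^2)
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)  # (N, K)
    logits = -alpha * d2
    logits -= logits.max(axis=1, keepdims=True)
    a = np.exp(logits)
    a /= a.sum(axis=1, keepdims=True)
    # accumulate assignment-weighted residuals per centroid
    resid = features[:, None, :] - centroids[None, :, :]      # (N, K, D)
    V = (a[:, :, None] * resid).sum(axis=0)                   # (K, D)
    # intra-normalization per centroid, then global L2 normalization
    V /= np.linalg.norm(V, axis=1, keepdims=True) + 1e-12
    v = V.ravel()
    return v / (np.linalg.norm(v) + 1e-12)
```

Each of the K rows of V plays the role of one "sub-descriptor" before the re-weighting network the abstract describes.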
|
| |
| MoBT15 Regular session, 321 |
Add to My Program |
| Visual SLAM |
|
| |
| Chair: Rosinol, Antoni | MIT |
| Co-Chair: Kim, Donghyun | University of Massachusetts Amherst |
| |
| 14:00-14:06, Paper MoBT15.1 | Add to My Program |
| Self-Supervised Domain Calibration and Uncertainty Estimation for Place Recognition |
|
| Lajoie, Pierre-Yves | École Polytechnique De Montréal |
| Beltrame, Giovanni | Ecole Polytechnique De Montreal |
Keywords: SLAM, Deep Learning for Visual Perception
Abstract: Visual place recognition techniques based on deep learning, which have established themselves as the state of the art in recent years, do not generalize well to environments visually different from the training set. Thus, to achieve top performance, it is sometimes necessary to fine-tune the networks to the target environment. To this end, we propose a self-supervised domain calibration procedure based on robust pose graph optimization from Simultaneous Localization and Mapping (SLAM) as the supervision signal without requiring GPS or manual labeling. Moreover, we leverage the procedure to improve uncertainty estimation for place recognition matches, which is important in safety critical applications. We show that our approach can improve the performance of a state-of-the-art technique on a target environment dissimilar from its training set and that we can obtain uncertainty estimates. We believe that this approach will help practitioners deploy robust place recognition solutions in real-world applications. Our code is available publicly: https://github.com/MISTLab/vpr-calibration-and-uncertainty
|
| |
| 14:06-14:12, Paper MoBT15.2 | Add to My Program |
| ISimLoc: Visual Global Localization for Previously Unseen Environments with Simulated Images (I) |
|
| Yin, Peng | City University of Hong Kong |
| Cisneros, Ivan | Carnegie Mellon University |
| Zhao, Shiqi | University of California San Diego |
| Zhang, Ji | Carnegie Mellon University |
| Choset, Howie | CMU |
| Scherer, Sebastian | Carnegie Mellon University |
Keywords: SLAM, Localization, Visual-Based Navigation, Visual Global Localization
Abstract: The camera is an attractive device for use in beyond visual line of sight drone operation since cameras are low in size, weight, power, and cost. However, state-of-the-art visual localization algorithms have trouble matching visual data that have significantly different appearances due to changes in illumination or viewpoint. This paper presents iSimLoc, a learning-based global re-localization approach that is robust to appearance and viewpoint differences. The features learned by iSimLoc's place recognition network can be utilized to match query images to reference images of a different stylistic domain and viewpoint. Additionally, our hierarchical global re-localization module searches in a coarse-to-fine manner, allowing iSimLoc to perform fast and accurate pose estimation. We evaluate our method on a dataset with appearance variations and a dataset that focuses on demonstrating large-scale matching over a long flight over complex terrain. iSimLoc achieves 88.7% and 83.8% successful retrieval rates on our two datasets, with 1.5s inference time, compared to 45.8% and 39.7% using the next best method. These results demonstrate robust localization in a range of environments.
|
| |
| 14:12-14:18, Paper MoBT15.3 | Add to My Program |
| Converting Depth Images and Point Clouds for Feature-Based Pose Estimation |
|
| Lösch, Robert | TU Bergakademie Freiberg |
| Sastuba, Mark | Federal Railway Authority Germany |
| Toth, Jonas | TU Bergakademie Freiberg |
| Jung, Bernhard | TU Bergakademie Freiberg |
Keywords: Recognition, RGB-D Perception
Abstract: In recent years, depth sensors have become more and more affordable and have found their way into a growing number of robotic systems. However, mono- or multi-modal sensor registration, often a necessary step for further processing, faces many challenges on raw depth images or point clouds. This paper presents a method of converting depth data into images capable of visualizing spatial details that are largely hidden in traditional depth images. After noise removal, a neighborhood of points forms two normal vectors whose difference is encoded into this new conversion. Compared to Bearing Angle images, our method yields brighter, higher-contrast images with more visible contours and more details. We tested feature-based pose estimation of both conversions in a visual odometry task and RGB-D SLAM. For all tested features, AKAZE, ORB, SIFT, and SURF, our new Flexion images yield better results than Bearing Angle images and show great potential to bridge the gap between depth data and classical computer vision. Source code is available here: https://rlsch.github.io/depth-flexion-conversion.
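For context, the Bearing Angle baseline the authors compare against encodes, per pixel, the angle between the viewing ray and the segment to the neighboring range measurement, mapped to grayscale. The sketch below is a hedged rendition of that standard conversion; the paper's Flexion images (built from differences of two normal vectors) are not reproduced here.

```python
import numpy as np

def bearing_angle_image(depth, dtheta):
    """Bearing Angle conversion of a range image (illustrative sketch).

    depth:  (H, W) range values along rays separated horizontally by
            dtheta radians per column.
    Returns an (H, W-1) uint8 grayscale image where the intensity
    encodes the angle between each ray and the segment to the left
    neighbor (law-of-cosines formulation).
    """
    r = depth[:, 1:]
    rp = depth[:, :-1]
    num = r - rp * np.cos(dtheta)
    den = np.sqrt(r**2 + rp**2 - 2.0 * r * rp * np.cos(dtheta)) + 1e-12
    ba = np.arccos(np.clip(num / den, -1.0, 1.0))   # angle in [0, pi]
    return (ba / np.pi * 255.0).astype(np.uint8)
```

On a flat surface the angle is nearly constant, so contours in the output highlight exactly the geometric discontinuities that feature detectors such as ORB or SIFT can latch onto.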
|
| |
| 14:18-14:24, Paper MoBT15.4 | Add to My Program |
| AirVO: An Illumination-Robust Point-Line Visual Odometry |
|
| Xu, Kuan | NTU |
| Hao, Yuefan | Geekplus Corp |
| Yuan, Shenghai | Nanyang Technological University |
| Wang, Chen | State University of New York at Buffalo |
| Xie, Lihua | Nanyang Technological University |
Keywords: SLAM, Localization
Abstract: This paper proposes an illumination-robust visual odometry (VO) system that incorporates both accelerated learning-based corner point algorithms and an extended line feature algorithm. To be robust to dynamic illumination, the proposed system employs convolutional neural networks (CNN) to detect and match reliable and informative corner points. Then point feature matching results and the distribution of point and line features are utilized to match and triangulate lines. By accelerating CNN parts and optimizing the pipeline, the proposed system is able to run in real-time on low-power embedded platforms. The proposed VO was evaluated on several datasets with varying illumination conditions, and the results show that it outperforms other state-of-the-art VO and VIO systems in terms of accuracy and robustness. The open-source nature of the proposed system allows for easy implementation and customization by the research community, enabling further development and improvement of VO for various applications.
|
| |
| 14:24-14:30, Paper MoBT15.5 | Add to My Program |
| NeRF-SLAM: Real-Time Dense Monocular SLAM with Neural Radiance Fields |
|
| Rosinol, Antoni | MIT |
| Carlone, Luca | Massachusetts Institute of Technology |
| Leonard, John | MIT |
Keywords: Mapping, Localization, SLAM
Abstract: We propose a novel geometric and photometric 3D mapping pipeline for accurate and real-time scene reconstruction from casually taken monocular images. To achieve this, we leverage recent advances in dense monocular SLAM and real-time hierarchical volumetric neural radiance fields. Our insight is that dense monocular SLAM provides the right information to fit a neural radiance field of the scene in real-time, by providing accurate pose estimates and depth-maps with associated uncertainty. Our proposed pipeline achieves better geometric and photometric accuracy than competing approaches (up to 178% better PSNR and 75% better L1 depth), while working in real-time and using only monocular images.
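The photometric accuracy figure quoted above is PSNR, which is just log-scaled mean squared error; a minimal reference implementation for checking such numbers:

```python
import numpy as np

def psnr(img, ref, max_val=1.0):
    """Peak signal-to-noise ratio (dB) between a rendered image and a
    reference image, both scaled to [0, max_val]."""
    mse = np.mean((img.astype(np.float64) - ref.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

A "178% better PSNR" claim therefore compares these dB values between pipelines on the same held-out views.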
|
| |
| 14:30-14:36, Paper MoBT15.6 | Add to My Program |
| Scale Jump-Aware Pose Graph Relaxation for Monocular SLAM with Re-Initializations |
|
| Yuan, Runze | Shanghaitech |
| Cheng, Ran | Midea Robozone |
| Lige, Liu | Midea Group |
| Sun, Tao | Massachusetts Institute of Technology |
| Kneip, Laurent | ShanghaiTech University |
Keywords: SLAM, Localization
Abstract: Pose graph relaxation has become an indispensable addition to SLAM, enabling efficient global registration of sensor reference frames under the objective of satisfying pair-wise relative transformation constraints. These constraints may be given by incremental motion estimation or by global place recognition. While place recognition enables loop closures and drift compensation, care has to be taken in the monocular case, in which local estimates of structure and displacements can differ from reality not just in terms of noise, but also in terms of a scale factor. Owing to the accumulation of scale propagation errors, this scale factor drifts over time, hence scale-drift aware pose graph relaxation has been introduced. We extend this idea to cases in which the relative scale between subsequent sensor frames is unknown, a situation that can easily occur if monocular SLAM enters re-initialization and no reliable overlap between successive local maps can be identified. The approach is realized by a hybrid pose graph formulation that combines the regular similarity consistency terms with novel, scale-blind constraints. We apply the technique to the practically relevant case of small indoor service robots capable of effectuating purely rotational displacements, a condition that can easily cause tracking failures. We demonstrate that globally consistent trajectories can be recovered even if multiple re-initializations occur along the loop, and present an in-depth study of success and failure cases.
|
| |
| 14:36-14:42, Paper MoBT15.7 | Add to My Program |
| Optimizing the Extended Fourier Mellin Transformation Algorithm |
|
| Jiang, Wenqing | ShanghaiTech University |
| Li, Chengqian | ShanghaiTech University |
| Cao, Jinyue | Shanghaitech University |
| Schwertfeger, Sören | ShanghaiTech University |
Keywords: SLAM, Computer Vision for Automation
Abstract: With the increasing application of robots, stable and efficient Visual Odometry (VO) algorithms are becoming more and more important. Based on the Fourier Mellin Transformation (FMT) algorithm, the extended Fourier Mellin Transformation (eFMT) is an image registration approach that can be applied to downward-looking cameras, for example on aerial and underwater vehicles. eFMT extends FMT to multi-depth scenes and thus to more application scenarios. It is a visual odometry method which estimates the pose transformation between three overlapping images. On this basis, we develop an optimized eFMT algorithm that improves certain aspects of the method and combines it with back-end optimization for the small loop of three consecutive frames. For this, we investigate the extraction of uncertainty information from the eFMT registration, the related objective function and the graph-based optimization. Finally, we design a series of experiments to investigate the properties of this approach and compare it with other VO and SLAM (Simultaneous Localization and Mapping) algorithms. The results show the superior accuracy and speed of our o-eFMT approach, which is published as open source.
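The FFT machinery underneath FMT/eFMT is phase correlation: a pure translation shows up as a delta peak in the inverse transform of the normalized cross-power spectrum, and rotation and scale are recovered the same way after a log-polar remap of the magnitude spectra. The sketch below covers the translation step only and is a minimal illustration, not the optimized o-eFMT pipeline.

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the cyclic shift (dy, dx) such that
    b == np.roll(a, (dy, dx), axis=(0, 1)).

    Real images additionally need windowing and sub-pixel peak
    refinement, omitted here for brevity.
    """
    A = np.fft.fft2(a)
    B = np.fft.fft2(b)
    R = np.conj(A) * B
    R /= np.abs(R) + 1e-12          # keep phase only
    corr = np.fft.ifft2(R).real     # delta peak at the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # wrap indices into the signed range [-size/2, size/2)
    h, w = a.shape
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)
```

The sharpness of the correlation peak is one natural source of the registration uncertainty that the back-end optimization described above consumes.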
|
| |
| 14:42-14:48, Paper MoBT15.8 | Add to My Program |
| Marker-Based Visual SLAM Leveraging Hierarchical Representations |
|
| Tourani, Ali | University of Luxembourg |
| Bavle, Hriday | University of Luxembourg |
| Sanchez-Lopez, Jose Luis | Interdisciplinary Center for Security, Reliability and Trust (SnT) |
| Munoz Salinas, Rafael | University of Cordoba, Spain |
| Voos, Holger | University of Luxembourg |
Keywords: SLAM, Visual-Inertial SLAM, Mapping
Abstract: Fiducial markers can encode rich information about the environment and aid Visual SLAM (VSLAM) approaches in reconstructing maps with practical semantic information. Current marker-based VSLAM approaches mainly utilize markers for improving feature detections in low-feature environments and/or incorporating loop closure constraints, generating only low-level geometric maps of the environment prone to inaccuracies in complex environments. To bridge this gap, this paper presents a VSLAM approach utilizing a monocular camera along with fiducial markers to generate hierarchical representations of the environment while improving the camera pose estimate. The proposed approach detects semantic entities from the surroundings, including walls, corridors, and rooms encoded within markers, and appropriately adds topological constraints among them. Experimental results on a real-world dataset collected with a robot demonstrate that the proposed approach outperforms a marker-based VSLAM baseline in terms of accuracy, given the addition of new constraints while creating enhanced map representations. Furthermore, it shows satisfactory results when comparing the reconstructed map quality to the one rebuilt using a LiDAR SLAM approach.
|
| |
| 14:48-14:54, Paper MoBT15.9 | Add to My Program |
| RVWO: A Robust Visual-Wheel SLAM System for Mobile Robots in Dynamic Environments |
|
| Mahmoud, Jaafar | ITMO University |
| Penkovskiy, Andrey | ITMO University |
| Ha, The Long Vuong | ITMO University |
| Burkov, Aleksei | Sber Robotics Laboratory |
| Kolyubin, Sergey | ITMO University |
Keywords: SLAM, Sensor Fusion, Wheeled Robots
Abstract: This paper presents RVWO, a system designed to provide robust localization and mapping for wheeled mobile robots in challenging scenarios. The proposed approach leverages a probabilistic framework that incorporates semantic prior information about landmarks and visual re-projection error to create a landmark reliability model, which acts as an adaptive kernel for the visual residuals in optimization. Additionally, we fuse visual residuals with wheel odometry measurements, taking advantage of the planar motion assumption. The RVWO system is designed to be robust against wrong data association due to moving objects, poor visual texture, bad illumination, and wheel slippage. Evaluation results demonstrate that the proposed system shows competitive results in dynamic environments and outperforms existing approaches on both public benchmarks and our custom hardware setup. We also provide the code as an open-source contribution to the robotics community.
|
| |
| 14:54-15:00, Paper MoBT15.10 | Add to My Program |
| Event Camera-Based Visual Odometry for Dynamic Motion Tracking of a Legged Robot Using Adaptive Time Surface |
|
| Zhu, Shifan | University of Massachusetts Amherst |
| Tang, Zhipeng | University of Massachusetts Amherst |
| Yang, Michael | University of Massachusetts Amherst |
| Learned-Miller, Erik | University of Massachusetts, Amherst |
| Kim, Donghyun | University of Massachusetts Amherst |
Keywords: SLAM, Legged Robots, Localization
Abstract: Our paper proposes a direct sparse visual odometry method that combines event and RGB-D data to estimate the pose of agile-legged robots during dynamic locomotion and acrobatic behaviors. Event cameras offer high temporal resolution and dynamic range, which can eliminate the issue of blurred RGB images during fast movements. This unique strength holds potential for accurate pose estimation of agile-legged robots, which has been a challenging problem to tackle. Our framework leverages the benefits of both RGB-D and event cameras to achieve robust and accurate pose estimation, even during dynamic maneuvers such as jumping and landing of a quadruped robot, Mini-Cheetah. Our major contributions are threefold: Firstly, we introduce an adaptive time surface (ATS) method that addresses the whiteout and blackout issue in common time surfaces by formulating pixel-wise decay rates based on scene complexity and motion speed. Secondly, we develop an effective pixel selection method that directly samples from event data and applies sample filtering through ATS, enabling us to pick pixels on distinct features. Lastly, we propose a nonlinear pose optimization formula that simultaneously performs 3D-2D alignment on both RGB-based and event-based maps and images, allowing the algorithm to fully exploit the benefits of both data streams. We extensively evaluate the performance of our framework on both public datasets and our own quadruped robot dataset, demonstrating its effectiveness in accurately estimating the pose of agile robots during dynamic movements.
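The conventional time surface that ATS adapts assigns each pixel exp(-(t_now - t_last)/tau) for a fixed decay rate tau, where t_last is the timestamp of the latest event at that pixel. The sketch below implements that common baseline; the paper's contribution is to replace the scalar tau with per-pixel decay rates driven by scene complexity and motion speed.

```python
import numpy as np

def time_surface(events, t_now, shape, tau=0.03):
    """Exponential-decay time surface of an event stream.

    events: iterable of (x, y, t) tuples, timestamps in seconds
    t_now:  time at which the surface is evaluated
    shape:  (H, W) sensor resolution
    tau:    fixed decay constant in seconds (ATS makes this per-pixel)
    """
    t_last = np.full(shape, -np.inf)
    for x, y, t in events:
        t_last[y, x] = max(t_last[y, x], t)   # keep the latest event
    ts = np.exp(-(t_now - t_last) / tau)
    ts[~np.isfinite(t_last)] = 0.0            # pixels that never fired
    return ts
```

With a fixed tau, fast motion saturates the surface toward 1 ("whiteout") and slow motion decays it toward 0 ("blackout"), which is precisely the failure mode the adaptive decay rates are meant to avoid.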
|
| |
| 15:00-15:06, Paper MoBT15.11 | Add to My Program |
| Enhancing Robustness of Line Tracking through Semi-Dense Epipolar Search in Line-Based SLAM |
|
| Seo, Dong-Uk | Korea Advanced Institute of Science and Technology |
| Lim, Hyungtae | Korea Advanced Institute of Science and Technology |
| Lee, Eungchang Mason | Korea Advanced Institute of Science and Technology |
| Lim, Hyunjun | Korea Advanced Institute of Science and Technology |
| Myung, Hyun | KAIST (Korea Advanced Institute of Science and Technology) |
Keywords: Visual-Inertial SLAM, Visual Tracking, SLAM
Abstract: Line information from urban structures can be exploited as an additional geometrical feature to achieve robust vision-based simultaneous localization and mapping (SLAM) systems in textureless scenes. Sometimes, however, conventional line tracking methods fail due to image blur or occlusion. Even though these lost line features are only a small subset of all available features, the failure in feature tracking can potentially lead to performance degradation of the SLAM system, particularly in textureless environments. To tackle this problem, we propose a robust line-tracking method for line-based monocular visual-inertial odometry. The proposed method generates a semi-dense map composed of depth and sparsity mesh using estimated 3D features. By leveraging this semi-dense map, our method performs a range-adaptive epipolar search to match the lines, allowing for robust line tracking while simultaneously reducing false positives. Furthermore, we propose an algorithm to resolve conflicts that occur when the tracked lines from consecutive matching do not accord with the lines matched by our method. This algorithm discriminately maintains line features while appropriately aggregating lines spread across multiple frames. As evaluated on the EuRoC dataset and a more challenging textureless corridor scene, our proposed method shows substantial performance increases compared with other line-based visual(-inertial) approaches.
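The epipolar search at the core of such methods relies on the standard relation l = F x = K^{-T}[t]_x R K^{-1} x: a pixel x in the first image constrains its match in the second image to the line l, and a depth-bounded (range-adaptive) search restricts matching to a segment of that line. A minimal sketch of the line computation only (the paper's semi-dense map and matching logic are not reproduced):

```python
import numpy as np

def skew(t):
    """3x3 skew-symmetric matrix of a 3-vector (cross-product matrix)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def epipolar_line(K, R, t, x):
    """Epipolar line l = F x in image 2 for homogeneous pixel x in image 1.

    K: (3, 3) intrinsics (assumed shared by both views for brevity)
    R, t: rotation and translation from camera 1 to camera 2
    Returns l normalized so that l @ x2 is a signed pixel distance.
    """
    Kinv = np.linalg.inv(K)
    F = Kinv.T @ skew(t) @ R @ Kinv   # fundamental matrix
    l = F @ x
    return l / (np.linalg.norm(l[:2]) + 1e-12)
```

Any correct match x2 satisfies l @ x2 = 0, so thresholding this distance along a depth-limited segment of l is what keeps the search both cheap and resistant to false positives.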
|
| |
| 15:06-15:12, Paper MoBT15.12 | Add to My Program |
| Stereo Visual Odometry with Deep Learning-Based Point and Line Feature Matching Using an Attention Graph Neural Network |
|
| Kannapiran, Shenbagaraj | Arizona State University |
| Bendapudi, Nalin | Ford Motor Company |
| Yu, Ming-Yuan | University of Michigan |
| Parikh, Devarth | Ford Motor Company |
| Berman, Spring | Arizona State University |
| Vora, Ankit | Ford Motor Company |
| Pandey, Gaurav | Ford Motor Company |
Keywords: SLAM, Localization
Abstract: Robust feature matching forms the backbone for most Visual Simultaneous Localization and Mapping (vSLAM), visual odometry, 3D reconstruction, and Structure from Motion (SfM) algorithms. However, recovering feature matches from texture-poor scenes is a major challenge and still remains an open area of research. In this paper, we present a Stereo Visual Odometry (SVO) technique based on point and line features which uses a novel feature-matching mechanism based on an Attention Graph Neural Network that is designed to perform well even under adverse weather conditions such as fog, haze, rain, and snow, and dynamic lighting conditions such as nighttime illumination and glare scenarios. We perform experiments on multiple real and synthetic datasets to validate our method's ability to perform SVO under low-visibility weather and lighting conditions through robust point and line matches. The results demonstrate that our method achieves more line feature matches than state-of-the-art line-matching algorithms, which when complemented with point feature matches perform consistently well in adverse weather and dynamic lighting conditions.
|
| |
| 15:12-15:18, Paper MoBT15.13 | Add to My Program |
| SID-SLAM: Semi-Direct Information-Driven RGB-D SLAM |
|
| Fontan, Alejandro | Queensland University of Technology |
| Giubilato, Riccardo | German Aerospace Center (DLR) |
| Oliva Maza, Laura | German Aerospace Center (DLR) |
| Civera, Javier | Universidad De Zaragoza |
| Triebel, Rudolph | German Aerospace Center (DLR) |
Keywords: SLAM, Localization, Data Sets for SLAM
Abstract: This work presents SID-SLAM, a complete SLAM framework for RGB-D cameras. Our main contribution is a semi-direct approach that, for the first time, combines tightly and indistinctly photometric and feature-based image measurements. Additionally, SID-SLAM uses information metrics to reduce the state size with minimal impact on accuracy. Our evaluation on several public datasets shows that we achieve state-of-the-art performance regarding accuracy, robustness and computational footprint in CPU real time. In order to facilitate research on semi-direct SLAM, we record the Minimal Texture dataset, composed of RGB-D sequences that are challenging for current baselines and in which our pipeline excels.
|
| |
| MoBT16 Regular session, 330A |
Add to My Program |
| AI-Enabled Robotics |
|
| |
| Chair: Fukuchi, Yosuke | National Institute of Informatics |
| Co-Chair: Kesavadas, Thenkurussi | University of Illinois at Urbana-Champaign |
| |
| 14:00-14:06, Paper MoBT16.1 | Add to My Program |
| The Design, Education and Evolution of a Robotic Baby (I) |
|
| Zhu, Hanqing | Georgia Institute of Technology |
| Wilson, Sean | Georgia Institute of Technology, Georgia Tech Research Institute |
| Feron, Eric | Georgia Institute of Technology |
Keywords: Learning and Adaptive Systems, AI-Based Methods, Control Architectures and Programming, Natural Language Acquisition and Programming
Abstract: Inspired by Alan Turing's idea of a child machine, we introduce the formal definition of a robotic baby, an integrated system with minimal world knowledge at birth, capable of learning incrementally and interactively, and adapting to the world. Within the definition, fundamental capabilities and system characteristics of the robotic baby are identified and presented as the system-level requirements. As a minimal viable prototype, the Baby architecture is proposed with a systems engineering design approach to satisfy the system-level requirements, which has been verified and validated with simulations and experiments on a robotic system. We demonstrate the capabilities of the robotic baby in natural language acquisition and semantic parsing in English and Chinese, as well as in natural language grounding, natural language reinforcement learning, natural language programming and system introspection for explainability. The education and evolution of the robotic baby are illustrated with real world robotic demonstrations. Inspired by the genetic inheritance in human beings, knowledge inheritance in robotic babies and its benefits regarding evolution are discussed.
|
| |
| 14:06-14:12, Paper MoBT16.2 | Add to My Program |
| Selective Presentation of AI Object Detection Results While Maintaining Human Reliance |
|
| Fukuchi, Yosuke | National Institute of Informatics |
| Yamada, Seiji | National Institute of Informatics |
Keywords: Acceptability and Trust, Intelligent Transportation Systems, AI-Based Methods
Abstract: Transparency in decision-making is an important factor for AI-driven autonomous systems to be trusted and relied on by users. Studies in the field of visual information processing typically attempt to make an AI system's behavior transparent by showing bounding boxes or heatmaps as explanations. However, it has also been found that an excessive amount of explanations sometimes causes information overload and brings negative results. This paper proposes SmartBBox, a method for reducing the number of bounding boxes to show while maintaining human reliance on an AI. It infers whether each bounding box is worth showing by predicting its effect on human reliance. SmartBBox can autonomously learn to decide whether to show bounding boxes from humans' usage data. We implemented and tested SmartBBox in an autonomous driving scenario in which a human continuously decides whether to rely on an autonomous driving system while observing the dynamic results of object detection by the system. The results suggest that SmartBBox can reduce the number of bounding boxes shown by 64.8% on average while keeping human reliance at the same level as in the case where all the bounding boxes are presented.
|
| |
| 14:12-14:18, Paper MoBT16.3 | Add to My Program |
| Ego-Noise Reduction of a Mobile Robot Using Noise Spatial Covariance Matrix Learning and Minimum Variance Distortionless Response |
|
| Lagacé, Pierre-Olivier | Université De Sherbrooke |
| Ferland, François | Université De Sherbrooke |
| Grondin, Francois | Université De Sherbrooke |
Keywords: Robot Audition
Abstract: The performance of speech and event recognition systems significantly improved recently thanks to deep learning methods. However, some of these tasks remain challenging when algorithms are deployed on robots, due to mechanical noise and electrical interference generated by their actuators that were unseen while training the neural networks. Ego-noise reduction as a preprocessing step therefore can help solve this issue when using pre-trained speech and event recognition algorithms on robots. In this paper, we propose a new method to reduce ego-noise using only a microphone array and less than two minutes of noise recordings. Using Principal Component Analysis (PCA), the best covariance matrix candidate is selected from a dictionary created online during calibration and used with the Minimum Variance Distortionless Response (MVDR) beamformer. Results show that the proposed method runs in real-time, improves the signal-to-distortion ratio (SDR) by up to 10 dB, decreases the word error rate (WER) by 55% in some cases and increases the Average Precision (AP) of event detection by up to 0.2.
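The MVDR beamformer used here has the classic closed form w = R^{-1} d / (d^H R^{-1} d), where R is the (ego-)noise spatial covariance matrix and d the steering vector toward the desired source; it minimizes noise power while passing the target direction undistorted (w^H d = 1). A minimal sketch (the paper's online PCA dictionary for selecting R is not reproduced):

```python
import numpy as np

def mvdr_weights(R, d, diag_load=1e-6):
    """MVDR beamformer weights w = R^{-1} d / (d^H R^{-1} d).

    R: (M, M) noise spatial covariance matrix (here: ego-noise)
    d: (M,)   steering vector toward the desired source
    Diagonal loading keeps the matrix inverse well conditioned.
    """
    M = R.shape[0]
    Rl = R + diag_load * np.trace(R).real / M * np.eye(M)
    Rinv_d = np.linalg.solve(Rl, d)
    return Rinv_d / (d.conj() @ Rinv_d)

def mvdr_output(w, X):
    """Apply the weights to multichannel STFT frames X of shape (M, T)."""
    return w.conj() @ X
```

In practice one set of weights is computed per frequency bin, with the best-matching R picked from the calibration dictionary as the abstract describes.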
|
| |
| 14:18-14:24, Paper MoBT16.4 | Add to My Program |
| Extracting Dynamic Navigation Goal from Natural Language Dialogue |
|
| Liang, Lanjun | Shanghai Institute of Technology |
| Bian, Ganghui | Yantai University, Yantai, P.R. China |
| Zhao, Huailin | Shanghai Institute of Technology |
| Dong, Yanzhi | Yantai University |
| Liu, Huaping | Tsinghua University |
Keywords: AI-Enabled Robotics, Natural Dialog for HRI, Human-Robot Collaboration
Abstract: Effective access to relevant environmental changes in large human environments is critical for service robots performing tasks. Since the position of a dynamic goal such as a human is variable, it is difficult for a robot to locate that person accurately. Notably, humans can obtain such information through social software while dealing with daily affairs, whereas current robots search for targets without considering these implicit information changes, which can leave the target object unfound. We therefore propose to extract implicit human location-change information from group-chat dialogues, i.e., watching dialogues in group chats and extracting who, when, and where (3W), to assist robots in finding explicit person targets. We then propose a dynamic spatio-temporal map (DSTM) to store the change information as knowledge for the robot. When searching for a target person, the robot follows the changing information in the scene to infer the person's possible locations and their probabilities, and then develops a search strategy. We deployed our framework on a custom mobile robot and performed instruction navigation tasks in a university building to evaluate our approach, demonstrating its ability to collect and use information in a large human social environment.
|
| |
| 14:24-14:30, Paper MoBT16.5 | Add to My Program |
| TidyBot: Personalized Robot Assistance with Large Language Models |
|
| Wu, Jimmy | Princeton University |
| Antonova, Rika | Stanford University |
| Kan, Adam | The Nueva School |
| Lepert, Marion | Stanford University |
| Zeng, Andy | Google DeepMind |
| Song, Shuran | Columbia University |
| Bohg, Jeannette | Stanford University |
| Rusinkiewicz, Szymon | Princeton University |
| Funkhouser, Thomas A. | Princeton University |
Keywords: Service Robotics, Mobile Manipulation, AI-Enabled Robotics
Abstract: For a robot to personalize physical assistance effectively, it must learn user preferences that can be generally reapplied to future scenarios. In this work, we investigate personalization of household cleanup with robots that can tidy up rooms by picking up objects and putting them away. A key challenge is determining the proper place to put each object, as people's preferences can vary greatly depending on personal taste or cultural background. For instance, one person may prefer storing shirts in the drawer, while another may prefer them on the shelf. We aim to build systems that can learn such preferences from just a handful of examples via prior interactions with a particular person. We show that robots can combine language-based planning and perception with the few-shot summarization capabilities of large language models (LLMs) to infer generalized user preferences that are broadly applicable to future interactions. This approach enables fast adaptation and achieves 91.2% accuracy on unseen objects in our benchmark dataset. We also demonstrate our approach on a real-world mobile manipulator called TidyBot, which successfully puts away 85.0% of objects in real-world test scenarios.
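The few-shot summarization idea can be illustrated with a minimal prompt builder that turns prior placements into examples for an LLM to complete. The wording and format below are hypothetical, not TidyBot's actual template:

```python
def build_preference_prompt(examples, query_object):
    """Few-shot prompt asking an LLM to generalize placement preferences.
    `examples` is a list of (object, receptacle) pairs observed from one person."""
    lines = ["Summarize where this person puts things, then place the new object."]
    for obj, place in examples:
        lines.append(f"{obj} -> {place}")
    lines.append(f"{query_object} ->")  # LLM completes with a receptacle
    return "\n".join(lines)
```

The returned string would be sent to an LLM, whose completion names the receptacle for the unseen object.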
|
| |
| 14:30-14:36, Paper MoBT16.6 | Add to My Program |
| L3MVN: Leveraging Large Language Models for Visual Target Navigation |
|
| Yu, Bangguo | University of Groningen |
| Kasaei, Hamidreza | University of Groningen |
| Cao, Ming | University of Groningen |
Keywords: Vision-Based Navigation, AI-Enabled Robotics, Service Robotics
Abstract: Visual target navigation in unknown environments is a crucial problem in robotics. Despite extensive investigation of classical and learning-based approaches in the past, robots lack common-sense knowledge about household objects and layouts. Prior state-of-the-art approaches to this task rely on learning the priors during training and typically require significant computational resources and time for learning. To address this, we propose a new framework for visual target navigation that leverages Large Language Models (LLMs) to impart common sense for object searching. Specifically, we introduce two paradigms: (i) zero-shot and (ii) feed-forward approaches that use language to find the relevant frontier from the semantic map as a long-term goal and explore the environment efficiently. Our analyses demonstrate the notable zero-shot generalization and transfer capabilities from the use of language. Experiments on Gibson and Habitat-Matterport 3D (HM3D) demonstrate that the proposed framework significantly outperforms existing map-based methods in terms of success rate and generalization. Ablation analysis also indicates that the common-sense knowledge from the language model leads to more efficient semantic exploration. Finally, we provide a real robot experiment to verify the applicability of our framework in real-world scenarios. The supplementary video and code can be accessed via the following link: https://sites.google.com/view/l3mvn.
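Language-guided frontier selection can be sketched minimally as follows, assuming relevance scores for semantic labels near each frontier have already been obtained from an LLM (the data format here is an assumption, not L3MVN's actual interface):

```python
def choose_frontier(frontiers, relevance):
    """Pick the frontier whose surrounding semantic labels the language
    model judges most relevant to the search target.
    `frontiers`: dicts with a "labels" list; `relevance`: label -> score."""
    def score(frontier):
        return sum(relevance.get(label, 0.0) for label in frontier["labels"])
    return max(frontiers, key=score)
```

The chosen frontier serves as the long-term navigation goal until the map is updated.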
|
| |
| 14:36-14:42, Paper MoBT16.7 | Add to My Program |
| TopSpark: A Timestep Optimization Methodology for Energy-Efficient Spiking Neural Networks on Autonomous Mobile Agents |
|
| Putra, Rachmad Vidya Wicaksana | Technische Universität Wien (TU Wien) |
| Shafique, Muhammad | New York University Abu Dhabi |
Keywords: AI-Enabled Robotics, Engineering for Robotic Systems, Autonomous Agents
Abstract: Autonomous mobile agents (e.g., mobile ground robots and UAVs) typically require low-power/energy-efficient machine learning (ML) algorithms to complete their ML-based tasks (e.g., object recognition) while adapting to diverse environments, as mobile agents are usually powered by batteries. These requirements can be fulfilled by Spiking Neural Networks (SNNs), as they offer low-power/energy processing due to their sparse computations and efficient online learning with bio-inspired mechanisms for adapting to different environments. Recent works have shown that the energy consumption of SNNs can be optimized by reducing the computation time of each neuron for processing a sequence of spikes (i.e., the timestep). However, state-of-the-art techniques rely on intensive design searches to determine fixed timestep settings for the inference phase only, thereby hindering SNN systems from achieving further energy-efficiency gains in both the training and inference phases. These techniques also prevent SNN systems from performing efficient online learning at run time. Toward this, we propose TopSpark, a novel methodology that leverages adaptive timestep reduction to enable energy-efficient SNN processing in both the training and inference phases, while keeping its accuracy close to the accuracy of SNNs without timestep reduction. The key ideas of TopSpark include: (1) analyzing the impact of different timestep settings on accuracy; (2) identifying neuron parameters that have a significant impact on accuracy at different timesteps; (3) employing parameter enhancements that let SNNs effectively perform learning and inference using less spiking activity due to reduced timesteps; and (4) developing a strategy to trade off accuracy, latency, and energy to meet the design requirements.
The experimental results show that our TopSpark reduces SNN latency by 3.9x as well as energy consumption by 3.5x for training and 3.3x for inference on average, across different network sizes, learning rules, and workloads, while maintaining accuracy within 2% of that of SNNs without timestep reduction. In this manner, TopSpark enables low-power/energy-efficient SNN processing for autonomous mobile agents.
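The accuracy-latency-energy trade-off in step (4) can be sketched as a constrained selection over profiled timestep settings. The 2% accuracy band follows the abstract; the profile format is an illustrative assumption:

```python
def pick_timestep(profiles, baseline_acc, max_drop=0.02):
    """From (timestep, accuracy, energy) profiles, keep the lowest-energy
    setting whose accuracy stays within `max_drop` of the baseline."""
    feasible = [p for p in profiles if baseline_acc - p[1] <= max_drop]
    return min(feasible, key=lambda p: p[2])  # minimize energy
```

TopSpark itself adapts the timestep at run time rather than selecting a fixed setting, so this is only the static version of the idea.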
|
| |
| 14:42-14:48, Paper MoBT16.8 | Add to My Program |
| Generating Executable Action Plans with Environmentally-Aware Language Models |
|
| Gramopadhye, Maitrey | University of North Carolina at Chapel Hill |
| Szafir, Daniel J. | University of North Carolina at Chapel Hill |
Keywords: AI-Enabled Robotics, Deep Learning Methods, Task Planning
Abstract: Large Language Models (LLMs) trained using massive text datasets have recently shown promise in generating action plans for robotic agents from high level text queries. However, these models typically do not consider the robot's environment, resulting in generated plans that may not actually be executable, due to ambiguities in the planned actions or environmental constraints. In this paper, we propose an approach to generate environmentally-aware action plans that agents are better able to execute. Our approach involves integrating environmental objects and object relations as additional inputs into LLM action plan generation to provide the system with an awareness of its surroundings, resulting in plans where each generated action is mapped to objects present in the scene. We also design a novel scoring function that, along with generating the action steps and associating them with objects, helps the system disambiguate among object instances and take into account their states. We evaluated our approach using the VirtualHome simulator and the ActivityPrograms knowledge base and found that action plans generated from our system had a 310% improvement in executability and a 147% improvement in correctness over prior work. The complete code and a demo of our method are publicly available at https://github.com/hri-ironlab/scene_aware_language_planner.
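One way to combine a language-model score with scene grounding is an additive bonus for actions whose objects actually exist in the scene. The weighting below is illustrative, not the paper's actual scoring function:

```python
def score_action(llm_logprob, action_object, scene_objects, alpha=1.0):
    """Score a candidate action step: the LLM's log-probability plus a
    grounding bonus when the referenced object is present in the scene.
    `alpha` balances fluency against executability (hypothetical weighting)."""
    grounded = 1.0 if action_object in scene_objects else 0.0
    return llm_logprob + alpha * grounded
```

Candidate steps would then be ranked by this score, so fluent but ungrounded actions lose to executable ones.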
|
| |
| 14:48-14:54, Paper MoBT16.9 | Add to My Program |
| Interaction-Aware and Hierarchically-Explainable Heterogeneous Graph-Based Imitation Learning for Autonomous Driving Simulation |
|
| Tabatabaie, Mahan | University of Connecticut |
| He, Suining | University of Connecticut |
| Shin, Kang G. | University of Michigan |
Keywords: Representation Learning, Learning from Demonstration, Imitation Learning
Abstract: Understanding and learning the actor-to-X interactions (AXIs), such as those between the focal vehicles (actor) and other traffic participants (e.g., other vehicles, pedestrians) as well as traffic environments (e.g., city/road map), is essential for development of a decision-making model and simulation of autonomous driving (AD). Existing practices on imitation learning (IL) for AD simulation, despite the advances in the model learnability, have not accounted for fusing and differentiating the heterogeneous AXIs in complex road environments. Furthermore, how to further explain the hierarchical structures within the complex AXIs remains largely under-explored. To overcome these challenges, we propose HGIL, an interaction-aware and hierarchically-explainable Heterogeneous Graph-based Imitation Learning approach for AD simulation. We have designed a novel heterogeneous interaction graph (HIG) to provide local and global representation as well as awareness of the AXIs. Integrating the HIG as the state embeddings, we have designed a hierarchically-explainable generative adversarial imitation learning approach, with local sub-graph and global cross-graph attention, to capture the interaction behaviors and driving decision-making processes. Our data-driven simulation and explanation studies have corroborated the accuracy and explainability of HGIL in learning and capturing the complex AXIs.
|
| |
| 14:54-15:00, Paper MoBT16.10 | Add to My Program |
| Zero-Shot Fault Detection for Manipulators through Bayesian Inverse Reinforcement Learning |
|
| Zhao, Hanqing | McGill University |
| Liu, Xue | McGill University |
| Dudek, Gregory | McGill University |
Keywords: Failure Detection and Recovery, Learning from Experience, Robust/Adaptive Control
Abstract: We consider the detection of faults in robotic manipulators, with particular emphasis on faults that have not been observed or identified in advance, which naturally includes those that occur very infrequently. Recent studies indicate that the reward function obtained through Inverse Reinforcement Learning (IRL) can help detect anomalies caused by faults in a control system (i.e. fault detection). Current IRL methods for fault detection, however, either use a linear reward representation or require extensive sampling from the environment to estimate the policy, rendering them inappropriate for safety-critical situations where sampling of failure observations via fault injection can be expensive and dangerous. To address this issue, this paper proposes a zero-shot and exogenous fault detector based on an approximate variational reward imitation learning (AVRIL) structure. The fault detector recovers a reward signal as a function of externally observable information to describe the normal operation, which can then be used to detect anomalies caused by faults. Our method incorporates expert knowledge through a customizable reward prior distribution, allowing the fault detector to learn the reward solely from normal operation samples, without the need for a simulator or costly interactions with the environment. We evaluate our approach for exogenous partial fault detection in multi-stage robotic manipulator tasks, comparing it with several baseline methods. The results demonstrate that our method more effectively identifies unseen faults even when they occur within just three controller time steps.
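Once a reward function describing normal operation has been recovered, using it as an anomaly signal can be sketched as a simple thresholding rule. The consecutive-step window echoes the abstract's three-step detection latency, but the rule itself is an assumption, not the paper's detector:

```python
def detect_fault(rewards, threshold, patience=3):
    """Flag a fault once the recovered reward stays below `threshold` for
    `patience` consecutive steps; return the triggering index, else None."""
    run = 0
    for i, r in enumerate(rewards):
        run = run + 1 if r < threshold else 0
        if run >= patience:
            return i
    return None
```

In practice the threshold would be calibrated on normal-operation samples, the only data the method assumes.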
|
| |
| 15:00-15:06, Paper MoBT16.11 | Add to My Program |
| Chat with the Environment: Interactive Multimodal Perception Using Large Language Models |
|
| Zhao, Xufeng | Universität Hamburg |
| Li, Mengdi | University of Hamburg |
| Weber, Cornelius | Knowledge Technology Group, University of Hamburg |
| Hafez, Muhammad Burhan | University of Hamburg |
| Wermter, Stefan | University of Hamburg |
Keywords: AI-Enabled Robotics, Multi-Modal Perception for HRI, AI-Based Methods
Abstract: Programming robot behavior in a complex world faces challenges on multiple levels, from dextrous low-level skills to high-level planning and reasoning. Recent pre-trained Large Language Models (LLMs) have shown remarkable reasoning ability in few-shot robotic planning. However, it remains challenging to ground LLMs in multimodal sensory input and continuous action output, while enabling a robot to interact with its environment and acquire novel information as its policies unfold. We develop a robot interaction scenario with a partially observable state, which necessitates a robot to decide on a range of epistemic actions in order to sample sensory information among multiple modalities, before being able to execute the task correctly. An interactive perception framework is therefore proposed with an LLM as its backbone, whose ability is exploited to instruct epistemic actions and to reason over the resulting multimodal sensations (vision, sound, haptics, proprioception), as well as to plan an entire task execution based on the interactively acquired information. Our study demonstrates that LLMs can provide high-level planning and reasoning skills and control interactive robot behavior in a multimodal environment, while multimodal modules with the context of the environmental state help ground the LLMs and extend their processing ability. The project website can be found at https://matcha-model.github.io.
|
| |
| 15:06-15:12, Paper MoBT16.12 | Add to My Program |
| Reinforcement Learning for Robot Navigation with Adaptive Forward Simulation Time (AFST) in a Semi-Markov Model |
|
| Chen, Yu'an | University of Science and Technology of China |
| Ye, Ruosong | University of Science and Technology of China |
| Tao, Ziyang | University of Science and Technology of China |
| Liu, Hongjian | University of Science and Technology of China |
| Chen, Guangda | NetEase |
| Peng, Jie | University of Science and Technology of China |
| Ma, Jun | University of Science and Technology of China |
| Zhang, Yu | University of Science and Technology of China |
| Ji, Jianmin | University of Science and Technology of China |
| Zhang, Yanyong | University of Science and Technology of China |
Keywords: Learning from Experience
Abstract: Deep reinforcement learning (DRL) algorithms have proven effective in robot navigation, especially in unknown environments, by directly mapping perception inputs into robot control commands. However, most existing methods ignore the local minimum problem in navigation and thereby cannot handle complex unknown environments. In this paper, we propose the first DRL-based navigation method modeled by a semi-Markov decision process (SMDP) with continuous action space, named Adaptive Forward Simulation Time (AFST), to overcome this problem. Specifically, we reduce the dimensions of the action space and improve the distributed proximal policy optimization (DPPO) algorithm for the specified SMDP problem by modifying its generalized advantage estimation (GAE) to better estimate the policy gradient in SMDPs. Experiments in various unknown environments demonstrate the effectiveness of AFST.
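One plausible way to adapt GAE to variable-duration SMDP transitions is to discount each step by gamma raised to the transition's duration. This is a sketch of the general idea, not AFST's exact modification:

```python
def smdp_gae(rewards, values, durations, gamma=0.99, lam=0.95):
    """GAE where transition t lasts `durations[t]` simulation steps, so the
    per-step discount becomes gamma**tau. `values` has one extra entry for
    the bootstrap state."""
    adv, last = [0.0] * len(rewards), 0.0
    for t in reversed(range(len(rewards))):
        g = gamma ** durations[t]
        delta = rewards[t] + g * values[t + 1] - values[t]
        last = delta + g * lam * last
        adv[t] = last
    return adv
```

With all durations equal to 1 this reduces to standard GAE, which is a useful sanity check when implementing the variant.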
|
| |
| 15:12-15:18, Paper MoBT16.13 | Add to My Program |
| A Hybrid Reinforcement Learning Approach with a Spiking Actor Network for Efficient Robotic Arm Target Reaching |
|
| Oikonomou, Katerina Maria | Democritus University of Thrace |
| Kansizoglou, Ioannis | Democritus University of Thrace |
| Gasteratos, Antonios | Democritus University of Thrace |
Keywords: Bioinspired Robot Learning, Reinforcement Learning, Mobile Manipulation
Abstract: The increasing demand for applications in competitive fields, such as assisted living and aerial robots, drives contemporary research into the development, implementation and integration of power-constrained solutions. Although deep neural networks (DNNs) have achieved remarkable performance in many robotics applications, energy consumption remains a major limitation. The paper at hand proposes a hybrid variation of the well-established deep deterministic policy gradient (DDPG) reinforcement learning approach to train a 6-degree-of-freedom robotic arm in the target-reaching task. In particular, we introduce a spiking neural network (SNN) for the actor model and a DNN for the critic, aiming to find an optimal set of actions for the robot. The deep critic network is employed only during training and discarded afterwards, allowing the deployment of the SNN on neuromorphic hardware for inference. The agent is supported by a combination of RGB and laser scan data exploited for collision avoidance and object detection. We compare the hybrid DDPG model against a classic DDPG one, demonstrating the superiority of our approach.
|
| |
| 15:18-15:24, Paper MoBT16.14 | Add to My Program |
| AR3n: A Reinforcement Learning-Based Assist-As-Needed Controller for Robotic Rehabilitation (I) |
|
| Pareek, Shrey | Cargill |
| Nisar, Harris | University of Illinois at Urbana Champaign |
| Kesavadas, Thenkurussi | University of Illinois at Urbana-Champaign |
Keywords: AI-Enabled Robotics, Rehabilitation Robotics, Reinforcement Learning
Abstract: In this paper, we present AR3n (pronounced as Aaron), an assist-as-needed (AAN) controller that utilizes reinforcement learning to supply adaptive assistance during a robot assisted handwriting rehabilitation task. AR3n uses the soft actor-critic reinforcement learning algorithm to derive a model-free controller for upper limb stroke rehabilitation. Unlike previous AAN controllers, our method does not require manual tuning of controller parameters or patient-specific physical models. We propose the use of a virtual patient model to generalize AR3n across multiple subjects. The system modulates robotic impedance based on a subject's tracking error, while minimizing the amount of robotic assistance. It delivers stable real-time assistance and prevents over-reliance on robotic assistance. The controller is experimentally validated through a set of simulations and human subject experiments. We compare our system to traditional rule-based controllers and a Learning-from-Demonstration controller previously proposed by our group. Finally, we demonstrate the efficacy and superiority of AR3n over rule-based controllers through a human subject study.
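The assist-as-needed core idea, raising assistance when tracking error grows and lowering it otherwise, can be caricatured by a proportional update rule. AR3n itself learns this behavior with soft actor-critic; the rule below is only an illustrative baseline of the kind it replaces:

```python
def update_assistance(gain, tracking_error, target_error, rate=0.1,
                      lo=0.0, hi=1.0):
    """Increase the assistance gain when error exceeds the target band,
    decrease it otherwise, clamped to [lo, hi] (rule-of-thumb sketch)."""
    gain += rate * (tracking_error - target_error)
    return min(hi, max(lo, gain))
```

Keeping the gain as low as the error allows is what prevents over-reliance on robotic assistance.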
|
| |
| MoBT17 Regular session, 330B |
Add to My Program |
| Learning from Demonstration |
|
| |
| Chair: Mou, Shaoshuai | Purdue University |
| Co-Chair: Bekiroglu, Yasemin | Chalmers University of Technology, University College London |
| |
| 14:00-14:06, Paper MoBT17.1 | Add to My Program |
| PACT: Perception-Action Causal Transformer for Autoregressive Robotics Pre-Training |
|
| Bonatti, Rogerio | Microsoft |
| Vemprala, Sai | Microsoft Corporation |
| Ma, Shuang | Microsoft |
| Vieira Frujeri, Felipe | Microsoft |
| Chen, Shuhang | Microsoft |
| Kapoor, Ashish | Microsoft |
Keywords: Representation Learning, Learning from Demonstration, Transfer Learning
Abstract: Robotics has long been a field riddled with complex systems architectures whose modules and connections, whether traditional or learning-based, require significant human expertise and prior knowledge. Inspired by large pre-trained language models, this work introduces a paradigm for pre-training a general purpose representation that can serve as a starting point for multiple tasks on a given robot. We present the Perception-Action Causal Transformer (PACT), a generative transformer-based architecture that aims to build representations directly from robot data in a self-supervised fashion. Through autoregressive prediction of states and actions over time, our model implicitly encodes dynamics and behaviors for a particular robot. Our experimental evaluation focuses on the domain of mobile agents, where we show that this robot-specific representation can function as a single starting point to achieve distinct tasks such as safe navigation, localization and mapping. We evaluate two form factors: a wheeled robot that uses a LiDAR sensor as perception input (MuSHR), and a simulated agent that uses first-person RGB images (Habitat). We show that finetuning small task-specific networks on top of the larger pretrained model results in significantly better performance compared to training a single model from scratch for all tasks simultaneously, and comparable performance to training a separate large model for each task independently. By sharing a common good-quality representation across tasks we can lower overall model capacity and speed up the real-time deployment of such systems. Open-sourced code: https://github.com/microsoft/PACT Video: https://youtu.be/mNQvQu_atuw
|
| |
| 14:06-14:12, Paper MoBT17.2 | Add to My Program |
| Learning from Sparse Demonstrations (I) |
|
| Jin, Wanxin | Arizona State University |
| Murphey, Todd | Northwestern University |
| Kulic, Dana | Monash University |
| Ezer, Neta | Northrop Grumman Corporation |
| Mou, Shaoshuai | Purdue University |
Keywords: Learning from Demonstration, Optimization and Optimal Control, Motion and Path Planning, Inverse Reinforcement Learning
Abstract: This paper develops the Continuous Pontryagin Differentiable Programming (Continuous PDP) method, which enables a robot to learn an objective function from a small number of sparsely demonstrated keyframes. The keyframes are a few desired sequential outputs that the robot should follow at certain time steps. The time span of the keyframes can differ from that of the robot's actual execution. The method jointly searches for an objective function and a time-warping function such that the robot's resulting motion sequentially follows the keyframes with minimal discrepancy loss. Continuous PDP minimizes the discrepancy loss using projected gradient descent, by efficiently solving for the gradient of robot motion with respect to the unknown parameters. The method is first evaluated on a simulated robot arm, and then applied to a 6-DoF maneuvering quadrotor to learn an objective function for motion planning in un-modeled environments. The results show the efficiency of the method, its ability to handle time misalignment between the keyframes and robot execution, and the generalization of objective learning to unseen motion conditions.
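The keyframe discrepancy loss can be sketched for a scalar trajectory with a simple linear time warp. Continuous PDP jointly learns a richer warping function together with the objective, so this is a simplification for intuition only:

```python
import numpy as np

def keyframe_loss(traj, times, keyframes, warp_scale):
    """Squared discrepancy between the rolled-out trajectory, sampled at
    warped keyframe times, and the demonstrated keyframes.
    `keyframes` is a list of (time, value) pairs; the warp is w(t) = a*t."""
    loss = 0.0
    for t_k, y_k in keyframes:
        y = np.interp(warp_scale * t_k, times, traj)  # trajectory at warped time
        loss += (y - y_k) ** 2
    return loss
```

Minimizing this over both the objective parameters (which shape `traj`) and `warp_scale` is the joint search the abstract describes.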
|
| |
| 14:12-14:18, Paper MoBT17.3 | Add to My Program |
| Neural Field Movement Primitives for Joint Modelling of Scenes and Motions |
|
| Tekden, Ahmet | Chalmers University of Technology |
| Deisenroth, Marc Peter | University College London |
| Bekiroglu, Yasemin | Chalmers University of Technology, University College London |
Keywords: Learning from Demonstration, Representation Learning, Deep Learning in Grasping and Manipulation
Abstract: This paper presents a novel Learning from Demonstration (LfD) method that uses neural fields to learn new skills efficiently and accurately. It achieves this by utilizing a shared embedding to learn both scene and motion representations in a generative way. Our method smoothly maps each expert demonstration to a scene-motion embedding and learns to model them without requiring hand-crafted task parameters or large datasets. It achieves data efficiency by enforcing scene and motion generation to be smooth with respect to changes in the embedding space. At inference time, our method can retrieve scene-motion embeddings using test time optimization, and generate precise motion trajectories for novel scenes. The proposed method is versatile and can employ images, 3D shapes, and any other scene representations that can be modeled using neural fields. Additionally, it can generate both end-effector positions and joint angle-based trajectories. Our method is evaluated on tasks that require accurate motion trajectory generation, where the underlying task parametrization is based on object positions and geometric scene changes. Experimental results demonstrate that the proposed method outperforms the baseline approaches and generalizes to novel scenes. Furthermore, in real-world experiments, we show that our method can successfully model multi-valued trajectories, it is robust to the distractor objects introduced at inference time, and it can generate 6D motions.
|
| |
| 14:18-14:24, Paper MoBT17.4 | Add to My Program |
| Augmentation Enables One-Shot Generalization in Learning from Demonstration for Contact-Rich Manipulation |
|
| Li, Xing | TU Berlin |
| Baum, Manuel | TU Berlin |
| Brock, Oliver | Technische Universität Berlin |
Keywords: Learning from Demonstration, Imitation Learning
Abstract: We introduce a Learning from Demonstration (LfD) approach for contact-rich manipulation tasks, i.e., tasks in which the manipulandum's motion is constrained by contact with the environment. Our approach is motivated by the insight that even a large number of demonstrations will often not contain sufficient information to obtain a general policy for the task. To obtain general policies, our approach augments the information contained in a single demonstration. This autonomous augmentation is based on the insight that environmental constraints play a central role in generalization. We validate our approach in real-world experiments with mechanisms with multiple, interdependent articulations, including latch locks, chain locks, and drawers with handles. The extracted policies, obtained from a single augmented human demonstration, generalize to different mechanisms of the same type and in varying environmental settings.
|
| |
| 14:24-14:30, Paper MoBT17.5 | Add to My Program |
| Using Single Demonstrations to Define Autonomous Manipulation Contact Tasks in Unstructured Environments Via Object Affordances |
|
| Regal, Frank | The University of Texas at Austin |
| Pettinger, Adam | The University of Texas at Austin |
| Duncan, John Alexander | The University of Texas at Austin |
| Parra, Fabian | University of Texas at Austin |
| Akita, Emmanuel | The University of Texas at Austin |
| Navarro, Alex | University of Texas at Austin |
| Pryor, Mitchell | University of Texas |
Keywords: Learning from Demonstration, Task and Motion Planning, Virtual Reality and Interfaces
Abstract: Performing a manipulation contact task in an unknown and unstructured environment is still a challenge. Learning from Demonstration (LfD) techniques provide an intuitive means to define difficult-to-model contact tasks, but have attributes that make them undesirable for novice users in uncertain environments. We present a novel end-to-end system that captures a single manipulation task demonstration from an augmented reality (AR) head-mounted display (HMD), computes an affordance primitive (AP) representation of the task, and sends the task parameters to a mobile manipulator for execution in real-time. Using an AR HMD for task demonstration and APs for task representation has several distinct advantages. AR task demonstration is intuitive, practical, and can be accomplished without requiring sensor installment in the task environment. APs provide a compact and legible task representation, enabling scalability, generalization, and modification of the task without significant data processing overhead. In this effort, we demonstrate system generalization with 10 object manipulation tasks, confirming the computed parameters from all tasks fit within AP tolerances. Secondly, we evaluate a mobile manipulator robot's ability to perform human-demonstrated tasks using AP representation. To increase robustness, we devised and tested four methods to correct for inherent, irreducible position errors in the system. A final study shows the system has a manipulation success rate of 96% from a single manipulation demonstration on an industrial wheel valve.
|
| |
| 14:30-14:36, Paper MoBT17.6 | Add to My Program |
| Constrained Dynamic Movement Primitives for Collision Avoidance in Novel Environments |
|
| Shaw, Seiji | Massachusetts Institute of Technology |
| Jha, Devesh | Mitsubishi Electric Research Laboratories |
| Raghunathan, Arvind | Mitsubishi Electric Research Laboratories |
| Corcodel, Radu Ioan | Mitsubishi Electric Research Laboratories |
| Romeres, Diego | Mitsubishi Electric Research Laboratories |
| Konidaris, George | Brown University |
| Nikovski, Daniel | MERL |
Keywords: Learning from Demonstration, Robot Safety, Collision Avoidance
Abstract: Dynamic movement primitives are widely used for learning skills that can be demonstrated to a robot by a skilled human or controller. While their generalization capabilities and simple formulation make them very appealing to use, they possess no strong guarantees to satisfy operational safety constraints for a task. We present constrained dynamic movement primitives (CDMPs), which can allow for positional constraint satisfaction in the robot workspace. Our method solves a non-linear optimization to perturb an existing DMP's forcing weights to admit a Zeroing Barrier Function (ZBF), which certifies positional workspace constraint satisfaction. We demonstrate our approach under different positional constraints on the end-effector movement on multiple physical robots, such as obstacle avoidance and workspace limitations.
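The zeroing-barrier condition that certifies constraint satisfaction can be checked along a discretized rollout as follows. This is only an illustrative verification of the ZBF inequality, not the paper's optimization over DMP forcing weights:

```python
def zbf_satisfied(h_vals, dt, alpha=1.0):
    """Check the discrete-time zeroing-barrier condition
    (h[t+1] - h[t])/dt + alpha*h[t] >= 0 along a rollout, where h(x) >= 0
    encodes the positional constraint (e.g., distance to an obstacle)."""
    return all((h_vals[t + 1] - h_vals[t]) / dt + alpha * h_vals[t] >= 0
               for t in range(len(h_vals) - 1))
```

If the condition holds everywhere and h starts non-negative, the rollout never leaves the safe set, which is what the perturbed forcing weights are optimized to guarantee.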
|
| |
| 14:36-14:42, Paper MoBT17.7 | Add to My Program |
| Learning Constraints on Autonomous Behavior from Proactive Feedback |
|
| Basich, Connor | University of Massachusetts Amherst |
| Mahmud, Saaduddin | University of Massachusetts Amherst |
| Zilberstein, Shlomo | University of Massachusetts |
Keywords: Learning from Demonstration, AI-Based Methods, Reinforcement Learning
Abstract: Learning from feedback is a common paradigm to acquire information that is hard to specify a priori. In this work, we consider an agent with a known nominal reward model that captures its high-level task objective. Furthermore, the agent operates subject to constraints that are unknown a priori and must be inferred from human interventions. Unlike existing methods, our approach does not rely on full or partial demonstration trajectories or assume a fully reactive human. Instead, we assume access only to sparse interventions, which may in fact be generated proactively by the human, and we only make minimal assumptions about the human. We provide both theoretical bounds on performance and empirical validations of our method. We show that our method enables an agent to learn a constraint set with high accuracy that generalizes well to new environments within a domain, whereas methods that only consider reactive feedback learn an incorrect constraint set that does not generalize well, making constraint violations more likely in new environments.
|
| |
| 14:42-14:48, Paper MoBT17.8 | Add to My Program |
| Learning Models of Adversarial Agent Behavior under Partial Observability |
|
| Ye, Sean | Georgia Institute of Technology |
| Natarajan, Manisha | Georgia Institute of Technology |
| Wu, Zixuan | Georgia Institute of Technology |
| Paleja, Rohan | Georgia Institute of Technology |
| Chen, Letian | Georgia Institute of Technology |
| Gombolay, Matthew | Georgia Institute of Technology |
Keywords: Learning from Demonstration, Deep Learning Methods, Representation Learning
Abstract: The need for opponent modeling and tracking arises in several real-world scenarios, such as professional sports, video game design, and drug-trafficking interdiction. In this work, we present GRaph based Adversarial Modeling with Mutual Information (GrAMMI) for modeling the behavior of an adversarial opponent agent. GrAMMI is a novel graph neural network (GNN) based approach that uses mutual information maximization as an auxiliary objective to predict the current and future states of an adversarial opponent with partial observability. To evaluate GrAMMI, we design two large-scale, pursuit-evasion domains inspired by real-world scenarios, where a team of heterogeneous agents is tasked with tracking and interdicting a single adversarial agent, and the adversarial agent must evade detection while achieving its own objectives. With the mutual information formulation, GrAMMI outperforms all baselines in both domains and achieves 31.68% higher log-likelihood on average for future adversarial state predictions across both domains.
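Mutual-information maximization as an auxiliary objective is commonly implemented with an InfoNCE-style loss. The sketch below assumes precomputed similarity logits between a query embedding and candidate embeddings, and may differ from GrAMMI's exact formulation:

```python
import numpy as np

def info_nce(sims, pos_idx):
    """InfoNCE loss: cross-entropy of the positive pair against all
    candidates; minimizing it maximizes a lower bound on mutual information.
    `sims` holds similarity logits, `pos_idx` marks the positive pair."""
    logits = sims - sims.max()            # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[pos_idx])
```

The loss is low when the positive pair's similarity dominates the candidates, tying the learned graph embedding to the adversary's true state.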
|
| |
| 14:48-14:54, Paper MoBT17.9 | Add to My Program |
| Robust Real-Time Motion Retargeting Via Neural Latent Prediction |
|
| Wang, Tiantian | Zhejiang University |
| Zhang, Haodong | Zhejiang University |
| Chen, Lu | Zhejiang University |
| Wang, Dongqi | Zhejiang University |
| Wang, Yue | Zhejiang University |
| Xiong, Rong | Zhejiang University |
Keywords: Learning from Demonstration, Imitation Learning, Dual Arm Manipulation
Abstract: Human-robot motion retargeting is a crucial approach for fast learning of motion skills. Achieving real-time retargeting demands high levels of synchronization and accuracy. Even though existing retargeting methods compute quickly, they still introduce a time-delay effect into synchronous retargeting. To mitigate this issue, this paper proposes a prediction-guided motion retargeting method, which effectively reduces the adverse impact of time delay. The proposed pipeline combines motion retargeting in a spatial-temporal graph-based structure with motion prediction in the latent space. The motion sequence retargeting builds a mapping and paired data from human poses to corresponding robot configurations for training the prediction model, and the generated robot motion satisfies joint-limit and self-collision constraints. The prediction-guided controller imports future robot joint motion to achieve anticipatory trajectory tracking, thereby compensating for the delay spent on computation and tracking. Experimental results show that our method outperforms other methods in terms of synchronization and similarity. Furthermore, our method is fault-tolerant in scenarios involving the loss of human information input.
|
| |
| 14:54-15:00, Paper MoBT17.10 | Add to My Program |
| Deep Probabilistic Movement Primitives with a Bayesian Aggregator |
|
| Przystupa, Michael | University of Alberta |
| Haghverd, Faezeh | University of Alberta |
| Jagersand, Martin | University of Alberta |
| Tosatto, Samuele | University of Innsbruck |
Keywords: Learning from Demonstration, Imitation Learning, Probabilistic Inference
Abstract: Movement primitives are trainable parametric models that reproduce robotic movements starting from a limited set of demonstrations. Previous works proposed simple linear models that exhibited high sample efficiency and generalization power by allowing temporal modulation of movements (reproducing movements faster or slower), blending (merging two movements into one), via-point conditioning (constraining a movement to meet some particular via-points), and context conditioning (generation of movements based on an observed variable, e.g., the position of an object). Previous works have also proposed neural-network-based motor primitive models and demonstrated their capacity to perform tasks with some forms of input conditioning or time-modulation representations. However, no single unified deep motor primitive model capable of all the previous operations has been proposed, limiting the potential applications of neural motor primitives. This paper proposes a deep movement primitive architecture that encodes all the operations above and uses a Bayesian context aggregator that allows more sound context conditioning and blending. Our results demonstrate that our approach can scale to reproduce complex motions on a larger variety of input choices than baselines while maintaining the operations that linear movement primitives provide.
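Via-point conditioning, one of the linear-primitive operations this abstract enumerates, has a closed-form Gaussian update on the ProMP weight distribution. A minimal numpy sketch under standard ProMP assumptions (weight prior N(mu_w, Sigma_w), basis features phi_t at time t); the function and variable names are illustrative, and this is not the paper's deep architecture.

```python
import numpy as np

def condition_promp(mu_w, Sigma_w, phi_t, y_t, sigma_y=1e-4):
    """Condition a linear movement primitive (ProMP) on a via-point:
    given the weight prior N(mu_w, Sigma_w) and basis features phi_t,
    update the weights so the trajectory passes (softly) through y_t."""
    phi_t = np.atleast_2d(phi_t)                 # (1, n_basis)
    S = phi_t @ Sigma_w @ phi_t.T + sigma_y      # innovation variance
    K = Sigma_w @ phi_t.T / S                    # Kalman-style gain
    mu_new = mu_w + (K * (y_t - phi_t @ mu_w)).ravel()
    Sigma_new = Sigma_w - K @ phi_t @ Sigma_w
    return mu_new, Sigma_new
```

After the update, the predicted value phi_t @ mu_new lies close to y_t, with the residual slack controlled by the observation noise sigma_y.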
|
| |
| 15:00-15:06, Paper MoBT17.11 | Add to My Program |
| Self-Supervised Visual Motor Skills Via Neural Radiance Fields |
|
| Gesel, Paul | University of New Hampshire |
| Sojib, Noushad | University of New Hampshire |
| Begum, Momotaz | University of New Hampshire |
Keywords: Learning from Demonstration, Imitation Learning, Deep Learning in Grasping and Manipulation
Abstract: In this paper, we propose a novel network architecture for visual imitation learning that exploits neural radiance fields (NeRFs) and key-point correspondence for self-supervised visual motor policy learning. The proposed network architecture incorporates a dynamic system output layer for policy learning. Combining the stability and goal-adaptation properties of dynamic systems with the robustness of keypoint-based correspondence yields a policy that is invariant to significant clutter, occlusions, changes in lighting conditions, and spatial variations in goal configurations. Experiments on multiple manipulation tasks show that our method outperforms comparable visual motor policy learning methods on both in-distribution and out-of-distribution scenarios when using a small number of training samples.
|
| |
| 15:06-15:12, Paper MoBT17.12 | Add to My Program |
| Autonomous Ultrasound Scanning towards Standard Plane Using Interval Interaction Probabilistic Movement Primitives |
|
| Hu, Yi | University of Alberta |
| Tavakoli, Mahdi | University of Alberta |
Keywords: Learning from Demonstration, Imitation Learning, Surgical Robotics: Planning
Abstract: Learning from demonstrations is the paradigm in which robots acquire new skills demonstrated by an expert, alleviating the physical burden on experts of performing repetitive tasks. Ultrasound scanning is one of the ways to view the anatomical structures of soft tissues, but it is repetitive for some tissue scanning tasks. In this study, a framework for autonomous ultrasound scanning towards a standard plane is proposed. Interaction probabilistic movement primitives (iProMP) were originally proposed for collaborative human-robot movement tasks. Inspired by interval type-2 fuzzy systems, an interval iProMP is proposed to learn the ultrasound scanning navigation strategy from scanning demonstrations, with robot movement and ultrasound image information as the collaborating modalities. The proposed interval iProMP improves the capacity to deal with uncertainties caused by insufficient observations during reproduction. U-Net is applied to recognize the desired ultrasound image shown during demonstrations, and a confidence map is used to evaluate ultrasound image quality. Breast seroma scanning is chosen as the ultrasound scanning task to validate the performance of the proposed autonomous scanning framework, in which ultrasound navigation realizes autonomous scanning that localizes the breast seroma. Simulation comparisons show the better performance of the proposed interval iProMP under insufficient observations compared to the traditional iProMP. Experimental results validate the feasibility and generality of the proposed framework using the interval iProMP, with a higher success rate than the traditional iProMP.
|
| |
| 15:12-15:18, Paper MoBT17.13 | Add to My Program |
| Learning Continuous Grasping Function with a Dexterous Hand from Human Demonstrations |
|
| Ye, Jianglong | UC San Diego |
| Wang, Jiashun | Carnegie Mellon University |
| Huang, Binghao | University of California, San Diego |
| Qin, Yuzhe | UC San Diego |
| Wang, Xiaolong | UC San Diego |
Keywords: Learning from Demonstration, Dexterous Manipulation, Deep Learning in Grasping and Manipulation
Abstract: We propose to learn to generate grasping motion for manipulation with a dexterous hand using implicit functions. With continuous time inputs, the model can generate a continuous and smooth grasping plan. We name the proposed model Continuous Grasping Function (CGF). CGF is learned via generative modeling with a Conditional Variational Autoencoder using 3D human demonstrations. We first convert the large-scale human-object interaction trajectories to robot demonstrations via motion retargeting, and then use these demonstrations to train CGF. During inference, we perform sampling with CGF to generate different grasping plans in the simulator and select the successful ones to transfer to the real robot. By training on diverse human data, our CGF allows generalization to manipulating multiple objects. Compared to previous planning algorithms, CGF is more efficient and achieves a significant improvement in success rate when transferred to grasping with the real Allegro Hand. Our project page is available at https://jianglongye.com/cgf/ .
|
| |
| 15:18-15:24, Paper MoBT17.14 | Add to My Program |
| Robot Programming by Demonstration: Trajectory Learning Enhanced by sEMG-Based User Hand Stiffness Estimation (I) |
|
| Biagiotti, Luigi | University of Modena and Reggio Emilia |
| Meattini, Roberto | University of Bologna |
| Chiaravalli, Davide | Alma Mater Studiorum, University of Bologna |
| Palli, Gianluca | University of Bologna |
| Melchiorri, Claudio | University of Bologna |
Keywords: Learning from Demonstration, Motion and Path Planning, Physical Human-Robot Interaction, Control Architectures and Programming
Abstract: Trajectory learning is one of the key components of robot Programming by Demonstration (PbD) approaches, which in many cases, especially in industrial practice, aim at defining complex manipulation patterns. To enhance these methods, which are generally based on a physical interaction in which the user guides the robot along the desired path, an additional input channel is considered in this work. The hand stiffness that the operator continuously modulates during the demonstration is estimated from forearm surface electromyography (sEMG) and translated into a request for a higher or lower accuracy level. A constrained optimization problem is then built (and solved) in the framework of smoothing B-splines to obtain a minimum-curvature trajectory approximating the taught path within the precision imposed by the user. Experimental tests in different applicative scenarios, involving both position and orientation, prove the benefits of the proposed approach in terms of intuitiveness of the programming procedure for the human operator and characteristics of the final motion.
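The stiffness-weighted smoothing described in this abstract can be illustrated with a discrete analogue: a weighted least-squares fit with a curvature penalty, where the sEMG-derived stiffness plays the role of the per-sample accuracy weight. This is a sketch of the idea, not the paper's constrained B-spline formulation; all names are illustrative.

```python
import numpy as np

def min_curvature_fit(y, weights, lam=1.0):
    """Discrete analogue of the paper's optimization: find a trajectory f
    minimizing sum_i w_i (f_i - y_i)^2 + lam * ||D2 f||^2, where D2 is the
    second-difference (curvature) operator. High stiffness (weight) forces
    close approximation; low stiffness lets the solver smooth more."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = len(y)
    D2 = np.diff(np.eye(n), n=2, axis=0)     # (n-2, n) second differences
    A = np.diag(w) + lam * D2.T @ D2         # normal equations of the objective
    return np.linalg.solve(A, w * y)
```

A straight-line demonstration has zero curvature and is returned unchanged, while noisy segments with low weight are flattened towards minimum curvature.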
|
| |
| MoBT18 Regular session, 331ABC |
Add to My Program |
| Human Detection and Pose |
|
| |
| Chair: Ghonasgi, Keya | The University of Texas at Austin |
| Co-Chair: Kim, H. Jin | Seoul National University |
| |
| 14:00-14:06, Paper MoBT18.1 | Add to My Program |
| Automated Key Action Detection for Closed Reduction of Pelvic Fractures by Expert Surgeons in Robot-Assisted Surgery |
|
| Pan, Ming Zhang | Guangxi University |
| Deng, Ya-Wen | Guangxi University |
| Li, Zhen | Institute of Automation, Chinese Academy of Sciences |
| Chen, Yuan | Guangxi University |
| Liao, Xiao-Lan | Guangxi University |
| Bian, Gui-Bin | Institute of Automation, Chinese Academy of Sciences |
Keywords: Gesture, Posture and Facial Expressions, Intention Recognition
Abstract: Pelvic fractures are one of the most serious traumas in orthopedics, and the technical proficiency and expertise of the surgical team strongly influence the quality of reduction results. With the advancement of information technology and robotics, robot-assisted pelvic fracture reduction surgery is expected to reduce the impact of inexperienced doctors and improve the accuracy and stability of pelvic reduction. However, this requires the robot to detect key surgeon actions from time-series data, enabling the robot to independently perceive the surgical status, predict the surgeon's intentions, assess the demonstrated level of professional competence, and evaluate the progress of the surgery. Therefore, a multi-task deep learning neural network architecture is proposed, which incorporates a Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) along with tri-modal fusion and feature extraction techniques. The proposed framework aims to achieve key action detection in closed reduction operations for pelvic fractures. Subsequently, a tri-modal fine-grained dataset was constructed, wherein 29, 32, and 14 labels were marked on flexion, position, and pressure data for 14 key closed reduction actions. The experimental results show that the correct detection rate of closed reduction actions is 92.3%, significantly higher than that of commonly used recognition algorithms. This work provides a method for the robot to learn the surgeon's professional knowledge, provides the basis for motion perception during the operation, and contributes to the autonomy of robot-assisted closed reduction surgery for pelvic fractures.
|
| |
| 14:06-14:12, Paper MoBT18.2 | Add to My Program |
| LAMP: Leveraging Language Prompts for Multi-Person Pose Estimation |
|
| Hu, Shengnan | University of Central Florida |
| Zheng, Ce | University of Central Florida |
| Zhou, Zixiang | University of Central Florida |
| Chen, Chen | University of Central Florida |
| Sukthankar, Gita | University of Central Florida |
Keywords: Gesture, Posture and Facial Expressions, Deep Learning for Visual Perception, Human Detection and Tracking
Abstract: Human-centric visual understanding is an important desideratum for effective human-robot interaction. In order to navigate crowded public places, social robots must be able to interpret the activity of the surrounding humans. This paper addresses one key aspect of human-centric visual understanding, multi-person pose estimation. Achieving good performance on multi-person pose estimation in crowded scenes is difficult due to the challenges of occluded joints and instance separation. In order to tackle these challenges and overcome the limitations of image features in representing invisible body parts, we propose a novel prompt-based pose inference strategy called LAMP, Language Assisted Multi-person Pose estimation. By utilizing the text representations generated by a well-trained language model (CLIP), LAMP can facilitate the understanding of poses on the instance and joint levels, and learn more robust visual representations that are less susceptible to occlusion. This paper demonstrates that language-supervised training boosts the performance of single-stage multi-person pose estimation, and both instance-level and joint-level prompts are valuable for training. The code is available at https://github.com/shengnanh20/LAMP.
|
| |
| 14:12-14:18, Paper MoBT18.3 | Add to My Program |
| Detecting Changes in Functional State: A Comparative Analysis Using Wearable Sensors and a Sensorized Tip |
|
| Otamendi, Janire | University of the Basque Country UPV/EHU |
| Zubizarreta, Asier | University of the Basque Country (UPV/EHU) |
Keywords: Medical Robots and Systems
Abstract: Gait analysis can provide relevant information about the physical and neurological conditions of individuals. For this reason, several studies have recently been carried out in an attempt to monitor people's gait and automatically detect gait anomalies. Among the various monitoring systems available for gait analysis, wearable sensors are considered the gold standard due to their wide capture range and low cost. However, for people who require assistive devices for walking, some studies have proposed the use of sensorized devices in order to minimize invasiveness. Nevertheless, there is still a lack of comparative works that evaluate the performance of sensorized assistive devices for walking against widely used wearable sensors. Hence, this paper presents a comparison between the performance of accelerometer-based wearable sensors and a sensorized tip developed by the authors to detect gait anomalies. The comparative study has been carried out in a controlled environment with five healthy subjects, in which three different physical states have been simulated. A machine-learning-based anomaly detection approach has been implemented based on the data captured by a set of wearable sensors and the sensorized tip, and the overall performance of both monitoring systems has been evaluated. Results show that even if both devices can provide an average accuracy of more than 80% in gait anomaly detection, the sensorized tip provides better performance.
|
| |
| 14:18-14:24, Paper MoBT18.4 | Add to My Program |
| DiffuPose: Monocular 3D Human Pose Estimation Via Denoising Diffusion Probabilistic Model |
|
| Choi, Jeongjun | Seoul National University |
| Shim, Dongseok | Seoul National University |
| Kim, H. Jin | Seoul National University |
Keywords: Human Detection and Tracking, Visual Learning, Deep Learning Methods
Abstract: Thanks to the development of 2D keypoint detectors, monocular 3D human pose estimation (HPE) via 2D-to-3D uplifting approaches has achieved remarkable improvements. Still, monocular 3D HPE is a challenging problem due to the inherent depth ambiguities and occlusions. To handle this problem, many previous works exploit temporal information to mitigate such difficulties. However, there are many real-world applications where frame sequences are not accessible. This paper focuses on reconstructing a 3D pose from a single 2D keypoint detection. Rather than exploiting temporal information, we alleviate the depth ambiguity by generating multiple 3D pose candidates that can be mapped to an identical 2D keypoint. We build a novel diffusion-based framework to effectively sample diverse 3D poses from an off-the-shelf 2D detector. By replacing the conventional denoising U-Net with a graph convolutional network that captures the correlation between human joints, our approach accomplishes further performance improvements. We evaluate our method on the widely adopted Human3.6M and HumanEva-I datasets. Comprehensive experiments are conducted to prove the efficacy of the proposed method, and they confirm that our model outperforms state-of-the-art multi-hypothesis 3D HPE methods.
|
| |
| 14:24-14:30, Paper MoBT18.5 | Add to My Program |
| BodySLAM++: Fast and Tightly-Coupled Visual-Inertial Camera and Human Motion Tracking |
|
| Henning, Dorian Fritz | Imperial College London |
| Choi, Christopher | Imperial College London |
| Schaefer, Simon | Technical University of Munich |
| Leutenegger, Stefan | Technical University of Munich |
Keywords: Human Detection and Tracking, Modeling and Simulating Humans, Visual-Inertial SLAM
Abstract: Robust, fast, and accurate human state -- 6D pose, shape, and posture -- estimation remains a challenging problem. For real-world applications, the ability to estimate the human state in real-time is highly desirable. In this paper, we present BodySLAM++, a fast, efficient, and accurate human and camera state estimation framework relying on visual-inertial data. BodySLAM++ extends an existing visual-inertial state estimation framework, OKVIS2, to solve the dual task of estimating camera and human states simultaneously. Our system improves the accuracy of both human and camera state estimation with respect to baseline methods by 26% and 12%, respectively, and achieves real-time performance at 15+ frames per second on an Intel i7-model CPU. Experiments were conducted on a custom dataset containing both ground truth human and camera poses collected with an indoor motion tracking system.
|
| |
| 14:30-14:36, Paper MoBT18.6 | Add to My Program |
| Characterizing the Onset and Offset of Motor Imagery During Passive Arm Movements Induced by an Upper-Body Exoskeleton |
|
| Mitra, Kanishka | The University of Texas at Austin |
| Racz, Frigyes Samuel | The University of Texas at Austin |
| Kumar, Satyam | The University of Texas at Austin |
| Deshpande, Ashish | The University of Texas |
| Millán, José del R. | The University of Texas at Austin |
Keywords: Brain-Machine Interfaces, Rehabilitation Robotics, Prosthetics and Exoskeletons
Abstract: Two distinct technologies have gained attention lately due to their prospects for motor rehabilitation: robotics and brain-machine interfaces (BMIs). Harnessing their combined efforts is a largely uncharted and promising direction that has immense clinical potential. However, a significant challenge is whether motor intentions from the user can be accurately detected using non-invasive BMIs in the presence of instrumental noise and passive movements induced by the rehabilitation exoskeleton. As an alternative to the straightforward continuous control approach, this study instead aims to characterize the onset and offset of motor imagery during passive arm movements induced by an upper-body exoskeleton, to allow for the natural control (initiation and termination) of functional movements. Ten participants were recruited to perform kinesthetic motor imagery (MI) of the right arm while attached to the robot, with LED cues indicating the initiation and termination of a goal-oriented reaching task. Using electroencephalogram signals, we built a decoder to detect the transition between i) rest and beginning MI and ii) maintaining and ending MI. Offline decoder evaluation achieved group-average accuracies of 60.7% for MI onset and 66.6% for MI offset, revealing that the start and stop of MI could be identified while attached to the robot. Furthermore, pseudo-online evaluation replicated this performance, forecasting reliable online exoskeleton control in the future. Our approach showed that participants could produce reliable, high-quality sensorimotor rhythms regardless of noise or passive arm movements induced by wearing the exoskeleton, which opens new possibilities for BMI control of assistive devices.
|
| |
| 14:36-14:42, Paper MoBT18.7 | Add to My Program |
| CLiFF-LHMP: Using Spatial Dynamics Patterns for Long-Term Human Motion Prediction |
|
| Zhu, Yufei | Örebro University |
| Rudenko, Andrey | Robert Bosch GmbH |
| Kucner, Tomasz Piotr | Aalto University |
| Palmieri, Luigi | Robert Bosch GmbH |
| Arras, Kai Oliver | Bosch Research |
| Lilienthal, Achim J. | Örebro University |
| Magnusson, Martin | Örebro University |
Keywords: Human Detection and Tracking
Abstract: Human motion prediction is important for mobile service robots and intelligent vehicles to operate safely and smoothly around people. The more accurate predictions are, particularly over extended periods of time, the better a system can, e.g., assess collision risks and plan ahead. In this paper, we propose to exploit maps of dynamics (MoDs, a class of general representations of place-dependent spatial motion patterns, learned from prior observations) for long-term human motion prediction (LHMP). We present a new MoD-informed human motion prediction approach, named CLiFF-LHMP, which is data efficient, explainable, and insensitive to errors from an upstream tracking system. Our approach uses CLiFF-map, a specific MoD trained with human motion data recorded in the same environment. We bias a constant velocity prediction with samples from the CLiFF-map to generate multi-modal trajectory predictions. In two public datasets we show that this algorithm outperforms the state of the art for predictions over very extended periods of time, achieving 45% more accurate prediction performance at 50s compared to the baseline.
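The core CLiFF-LHMP idea of biasing a constant-velocity rollout with directions sampled from a map of dynamics can be sketched as follows. The `sample_heading` callback stands in for querying a learned CLiFF-map; the blending rule and parameter names are assumptions, not the paper's exact algorithm.

```python
import math

def cliff_like_predict(pos, vel, sample_heading, horizon, dt=0.4, beta=0.5):
    """Roll out a constant-speed prediction, nudging the heading at each
    step toward a direction sampled from a map of dynamics (MoD) at the
    current position; beta controls how strongly the MoD biases the rollout."""
    x, y = pos
    speed = math.hypot(vel[0], vel[1])
    heading = math.atan2(vel[1], vel[0])
    traj = [(x, y)]
    for _ in range(int(horizon / dt)):
        sampled = sample_heading((x, y))
        # Shortest angular difference between sampled MoD direction and heading.
        diff = math.atan2(math.sin(sampled - heading), math.cos(sampled - heading))
        heading += beta * diff
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        traj.append((x, y))
    return traj
```

Drawing several rollouts with different MoD samples yields the multi-modal trajectory predictions the abstract refers to; with a uniform MoD this degrades gracefully to plain constant-velocity prediction.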
|
| |
| 14:42-14:48, Paper MoBT18.8 | Add to My Program |
| GloPro: Globally-Consistent Uncertainty-Aware 3D Human Pose Estimation & Tracking in the Wild |
|
| Schaefer, Simon | Technical University of Munich |
| Henning, Dorian Fritz | Imperial College London |
| Leutenegger, Stefan | Technical University of Munich |
Keywords: Modeling and Simulating Humans, Human and Humanoid Motion Analysis and Synthesis
Abstract: An accurate and uncertainty-aware 3D human body pose estimation is key to enabling truly safe yet efficient human-robot interactions. Current uncertainty-aware methods in 3D human pose estimation are limited to predicting the uncertainty of the body posture, while effectively neglecting the body shape and root pose. In this work, we present GloPro, which is, to the best of our knowledge, the first framework to predict an uncertainty distribution of a 3D body mesh including its shape, pose, and root pose, by efficiently fusing visual clues with a learned motion model. We demonstrate that it vastly outperforms state-of-the-art methods in terms of human trajectory accuracy in a world coordinate system (even in the presence of severe occlusions), yields consistent uncertainty distributions, and can run in real-time.
|
| |
| 14:48-14:54, Paper MoBT18.9 | Add to My Program |
| Anytime, Anywhere: Human Arm Pose from Smartwatch Data for Ubiquitous Robot Control and Teleoperation |
|
| Weigend, Fabian Clemens | Arizona State University |
| Sonawani, Shubham | Arizona State University |
| Drolet, Michael | Arizona State University |
| Ben Amor, Heni | Arizona State University |
Keywords: Multi-Modal Perception for HRI, Telerobotics and Teleoperation, Wearable Robotics
Abstract: This work devises an optimized machine learning approach for human arm pose estimation from a single smartwatch. Our approach results in a distribution of possible wrist and elbow positions, which allows for a measure of uncertainty and the detection of multiple possible arm posture solutions, i.e., multimodal pose distributions. Combining estimated arm postures with speech recognition, we turn the smartwatch into a ubiquitous, low-cost and versatile robot control interface. We demonstrate in two use-cases that this intuitive control interface enables users to swiftly intervene in robot behavior, to temporarily adjust their goal, or to train completely new control policies by imitation. Extensive experiments show that the approach results in a 40% reduction in prediction error over the current state-of-the-art and achieves a mean error of 2.56 cm for wrist and elbow positions.
|
| |
| 14:54-15:00, Paper MoBT18.10 | Add to My Program |
| Recognizing Real-World Intentions Using a Multimodal Deep Learning Approach with Spatial-Temporal Graph Convolutional Networks |
|
| Shi, Jiaqi | Osaka University, RIKEN |
| Liu, Chaoran | Riken |
| Ishi, Carlos Toshinori | RIKEN |
| Wu, Bowen | Osaka University; RIKEN |
| Ishiguro, Hiroshi | Osaka University |
Keywords: Intention Recognition, Deep Learning Methods, AI-Based Methods
Abstract: Identifying intentions is a critical task for comprehending the actions of others, anticipating their future behavior, and making informed decisions. However, recognizing intentions is challenging due to the uncertainty of future human activities and the complexity of the influencing factors. In this work, we explore methods for recognizing the intentions underlying human behaviors in the real world, aiming to boost intelligent systems' ability to recognize potential intentions and understand human behaviors. We collect data of real-world human behaviors prior to the use of a hand dispenser and a temperature scanner at a building entrance. These data are processed and labeled into intention categories. A questionnaire is conducted to survey human ability to infer the intentions of others. Skeleton data and image features are extracted, informed by the questionnaire responses. For skeleton-based intention recognition, we propose a spatial-temporal graph convolutional network that performs graph convolutions on both part-based graphs and adaptive graphs, which achieves the best performance compared with baseline models on the same task. A deep-learning-based method using multimodal features is proposed to automatically infer intentions; in our experiments it accurately predicts intentions from past behaviors, significantly outperforming humans.
|
| |
| 15:00-15:06, Paper MoBT18.11 | Add to My Program |
| VADER: Vector-Quantized Generative Adversarial Network for Motion Prediction |
|
| Yasar, Mohammad | University of Virginia |
| Iqbal, Tariq | University of Virginia |
Keywords: Intention Recognition, Human Detection and Tracking, Human-Robot Teaming
Abstract: Human motion prediction is an essential component for enabling close-proximity human-robot collaboration. The task of accurately predicting human motion is non-trivial and is compounded by the variability of human motion and the presence of multiple humans in proximity. To address some of the open challenges in motion prediction, in this work, we propose VADER, a novel sequence learning algorithm that models past observed poses using a flexible discrete latent space. VADER introduces the concept of Vector Quantization for human motion prediction, enabling the learning of a discrete latent space without being restricted by any static prior. In addition, we propose a new objective function that uses the discriminator objective to penalize deviation of predicted motion from the ground-truth. Finally, to explicitly model interaction in multiple humans, we introduce a lightweight attention mechanism to condition per-agent prediction on the previous hidden states of all the agents. Our evaluation across three scenarios: single-agent, multi-agent, and human-robot collaboration shows that VADER outperformed all the state-of-the-art approaches, resulting in more feasible human poses that align better with the ground-truth. Finally, we conducted extensive ablation studies to emphasize the importance of the proposed modules.
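The vector-quantization step that VADER introduces for motion prediction maps each continuous latent to its nearest codebook entry. Below is a minimal numpy sketch of that discretization; training mechanics such as the straight-through estimator and codebook updates are omitted, and the function name is illustrative.

```python
import numpy as np

def vector_quantize(z, codebook):
    """Map each latent vector in z (n, d) to its nearest entry in a
    codebook (K, d); return the discrete indices and the quantized vectors."""
    z = np.asarray(z, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Pairwise squared Euclidean distances, shape (n, K).
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    return idx, codebook[idx]
```

The returned indices form the discrete latent space over which a sequence model can predict future motion without being restricted by a static prior.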
|
| |
| 15:06-15:12, Paper MoBT18.12 | Add to My Program |
| SG-LSTM: Social Group LSTM for Robot Navigation through Dense Crowds |
|
| Bhaskara, Rashmi | Purdue University |
| Chiu, Maurice | Purdue University |
| Bera, Aniket | Purdue University |
Keywords: Human Detection and Tracking, Datasets for Human Motion
Abstract: As personal robots become increasingly accessible and affordable, their applications extend beyond large corporate warehouses and factories to operate in diverse, less controlled environments, where they interact with larger groups of people. In such contexts, ensuring not only safety and efficiency but also mitigating potential adverse psychological impacts on humans and adhering to unwritten social norms become paramount. In this research, we aim to address these challenges by developing a cutting-edge model capable of predicting pedestrian movements and interactions in crowded environments. To this end, we propose a novel approach called the Social Group Long Short-term Memory (SG-LSTM) model, which effectively captures the complexities of human group behavior and interactions within dense surroundings. By integrating social awareness into the LSTM architecture, our model achieves significantly enhanced trajectory predictions. The implementation of our SG-LSTM model empowers navigation algorithms to compute collision-free paths faster and with higher accuracy, particularly in complex and crowded scenarios. To foster further advancements in social navigation research, we contribute a substantial video dataset comprising labeled pedestrian groups, which we release to the broader research community. To thoroughly evaluate the performance of our approach, we conduct extensive experiments on multiple datasets, including ETH, Hotel, and MOT15. We compare various prediction approaches, such as LIN, LSTM, O-LSTM, and S-LSTM, and rigorously assess runtime performance.
|
| |
| 15:12-15:18, Paper MoBT18.13 | Add to My Program |
| SmartMocap: Joint Estimation of Human and Camera Motion Using Uncalibrated RGB Cameras |
|
| Saini, Nitin | Max Planck Institute for Intelligent Systems |
| Huang, Chun-Hao Paul | Max Planck Institute for Intelligent Systems, Tübingen |
| Black, Michael | Max Planck Institute for Intelligent Systems in Tübingen |
| Ahmad, Aamir | University of Stuttgart |
Keywords: Gesture, Posture and Facial Expressions, Human Detection and Tracking, Deep Learning for Visual Perception
Abstract: Markerless human motion capture (mocap) from multiple RGB cameras is a widely studied problem. Existing methods either need calibrated cameras or calibrate them relative to a static camera, which acts as the reference frame for the mocap system. The calibration step has to be done a priori for every capture session, which is a tedious process, and re-calibration is required whenever cameras are intentionally or accidentally moved. In this paper, we propose a mocap method which uses multiple static and moving extrinsically uncalibrated RGB cameras. The key components of our method are as follows. First, since the cameras and the subject can move freely, we select the ground plane as a common reference to represent both the body and the camera motions unlike existing methods which represent bodies in the camera coordinate. Second, we learn a probability distribution of short human motion sequences (~1sec) relative to the ground plane and leverage it to disambiguate between the camera and human motion. Third, we use this distribution as a motion prior in a novel multi-stage optimization approach to fit the SMPL human body model and the camera poses to the human body keypoints on the images. Finally, we show that our method can work on a variety of datasets ranging from aerial cameras to smartphones. It also gives more accurate results compared to the state-of-the-art on the task of monocular human mocap with a static camera. A video demo is available at https://tinyurl.com/yeykrb67 and our code is available at https://tinyurl.com/2p9rme9y .
|
| |
| MoBT19 Regular session, 360 Ambassador Ballroom |
Add to My Program |
| Deep Learning Methods II |
|
| |
| Chair: Grandia, Ruben | Disney Research |
| Co-Chair: Kelly, Stephen | McMaster University |
| |
| 14:00-14:06, Paper MoBT19.1 | Add to My Program |
| Online Continual Learning for Robust Indoor Object Recognition |
|
| Michieli, Umberto | Samsung Research |
| Ozay, Mete | Samsung Research |
Keywords: Continual Learning, Incremental Learning, Learning Categories and Concepts
Abstract: Vision systems mounted on home robots need to interact with unseen classes in changing environments. Robots have limited computational resources, labelled data, and storage capability. These requirements pose some unique challenges: models should adapt without forgetting past knowledge in a data- and parameter-efficient way. We characterize the problem as few-shot (FS) online continual learning (OCL), where robotic agents learn from a non-repeated stream of few-shot data, updating only a few model parameters. Additionally, such models experience variable conditions at test time, where objects may appear in different poses (e.g., horizontal or vertical) and environments (e.g., day or night). To improve the robustness of CL agents, we propose RobOCLe, which: 1) constructs an enriched feature space by computing high-order statistical moments from the embedded features of samples; and 2) computes the similarity between the high-order statistics of samples in the enriched feature space to predict their class labels. We evaluate the robustness of CL models to train/test augmentations in various cases. We show that different moments allow RobOCLe to capture different properties of deformations, providing higher robustness with no decrease in inference speed.
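As a rough illustration of the moment-based matching the abstract describes, the sketch below (numpy, with toy Gaussian features; `moment_descriptor`, the cosine classifier, and the class prototypes are our own stand-ins, not the authors' code) builds enriched descriptors from the mean plus second- and third-order central moments and predicts the class of the nearest prototype:

```python
import numpy as np

def moment_descriptor(features, orders=(2, 3)):
    """Concatenate the mean and higher-order central moments of a set of
    embedded feature vectors (one row per sample) into one descriptor."""
    mu = features.mean(axis=0)
    parts = [mu] + [((features - mu) ** k).mean(axis=0) for k in orders]
    return np.concatenate(parts)

def classify(query_feats, prototypes):
    """Predict the class whose moment descriptor is most cosine-similar."""
    q = moment_descriptor(query_feats)
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
    return max(prototypes, key=lambda c: cos(q, prototypes[c]))

rng = np.random.default_rng(0)
# Two toy classes whose embedded features differ only in their mean.
protos = {c: moment_descriptor(rng.normal(c, 1.0, size=(20, 8)))
          for c in (0, 3)}
sample = rng.normal(3, 1.0, size=(5, 8))
print(classify(sample, protos))  # class 3 (the closer prototype)
```

The descriptor is just a concatenation, so adding or removing moment orders changes its length but not the classifier.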
|
| |
| 14:06-14:12, Paper MoBT19.2 | Add to My Program |
| PaintNet: Unstructured Multi-Path Learning from 3D Point Clouds for Robotic Spray Painting |
|
| Tiboni, Gabriele | Politecnico Di Torino |
| Camoriano, Raffaello | Politecnico Di Torino |
| Tommasi, Tatiana | Politecnico Di Torino |
Keywords: Data Sets for Robot Learning, Deep Learning Methods, Computer Vision for Manufacturing
Abstract: Popular industrial robotic problems such as spray painting and welding require (i) conditioning on free-shape 3D objects and (ii) planning of multiple trajectories to solve the task. Yet, existing solutions make strong assumptions on the form of input surfaces and the nature of output paths, resulting in limited approaches unable to cope with real-data variability. By leveraging on recent advances in 3D deep learning, we introduce a novel framework capable of dealing with arbitrary 3D surfaces, and handling a variable number of unordered output paths (i.e. unstructured). Our approach predicts local path segments, which can be later concatenated to reconstruct long-horizon paths. We extensively validate the proposed method in the context of robotic spray painting by releasing PaintNet, the first public dataset of expert demonstrations on free-shape 3D objects collected in a real industrial scenario. A thorough experimental analysis demonstrates the capabilities of our model to promptly predict smooth output paths that cover up to 95% of previously unseen object surfaces, even without explicitly optimizing for paint coverage.
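The concatenation of predicted local segments into a long-horizon path can be sketched with a simple greedy chaining rule (an illustrative assumption on our part; the paper's actual concatenation procedure may differ):

```python
import numpy as np

def concatenate_segments(segments):
    """Chain unordered local path segments into one long-horizon path by
    greedily appending the segment whose start is nearest the current end."""
    segs = [np.asarray(s, float) for s in segments]
    path = [segs.pop(0)]
    while segs:
        end = path[-1][-1]
        i = min(range(len(segs)),
                key=lambda j: np.linalg.norm(segs[j][0] - end))
        path.append(segs.pop(i))
    return np.vstack(path)

# Three unordered 2D segments that together form a straight stroke.
segments = [[(0, 0), (1, 0)], [(2, 0), (3, 0)], [(1, 0), (2, 0)]]
print(concatenate_segments(segments))
```

A greedy nearest-start rule is the simplest ordering heuristic; any assignment solver could replace it without changing the interface.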
|
| |
| 14:12-14:18, Paper MoBT19.3 | Add to My Program |
| Switching Head-Tail Funnel UNITER for Dual Referring Expression Comprehension with Fetch-And-Carry Tasks |
|
| Korekata, Ryosuke | Keio University |
| Kambara, Motonari | Keio University |
| Yoshida, Yu | Keio University |
| Ishikawa, Shintaro | Keio University |
| Kawasaki, Yosuke | Keio University |
| Takahashi, Masaki | Keio University |
| Sugiura, Komei | Keio University |
Keywords: Deep Learning for Visual Perception, Deep Learning Methods, AI-Enabled Robotics
Abstract: This paper describes a domestic service robot (DSR) that fetches everyday objects and carries them to specified destinations according to free-form natural language instructions. Given an instruction such as "Move the bottle on the left side of the plate to the empty chair," the DSR is expected to identify the bottle and the chair from multiple candidates in the environment and carry the target object to the destination. Most of the existing multimodal language understanding methods are impractical in terms of computational complexity because they require inferences for all combinations of target object candidates and destination candidates. We propose Switching Head-Tail Funnel UNITER, which solves the task by predicting the target object and the destination individually using a single model. Our method is validated on a dataset based on a standard dataset for Vision-and-Language Navigation with object manipulation tasks. The results show that our method outperforms the baseline method in terms of language comprehension accuracy. Furthermore, we conduct physical experiments in which a DSR delivers standardized everyday objects in a standardized domestic environment as requested by instructions with referring expressions. The experimental results show that the object grasping and placing actions are achieved with success rates of more than 90%.
|
| |
| 14:18-14:24, Paper MoBT19.4 | Add to My Program |
| FeatDANet: Feature-Level Domain Adaptation Network for Semantic Segmentation |
|
| Li, Jiao | Shanghai Institute of Microsystem and Information Technology |
| Shi, Wenjun | Shanghai Institute of Microsystem and Information Technology |
| Zhu, Dongchen | Shanghai Institute of Microsystem and Information Technology, China |
| Zhang, Guanghui | Shanghai Institute of Microsystem and Information Technology, China |
| Zhang, Xiaolin | Shanghai Institute of Microsystem and Information Technology, China |
| Li, Jiamao | Shanghai Institute of Microsystem and Information Technology, China |
Keywords: Transfer Learning, Object Detection, Segmentation and Categorization, Deep Learning Methods
Abstract: Unsupervised domain adaptation (UDA) is proposed to better adapt a network trained on labeled synthetic data to unlabeled real-world data, addressing the annotation cost. However, most of these methods attend to domain distributions at the input and output stages while ignoring important differences in semantic expression and local detail at the intermediate feature stages. Therefore, a novel UDA network named FeatDANet is presented to align feature-level domain distributions at each encoder layer. Specifically, two attention-based modules, abbreviated as IFAM and DFLM, are designed and implemented by mixing queries and keys between domains. The former realizes Inter-domain Feature Alignment by transferring feature style, and the latter achieves Domain-invariant Feature Learning that is robust to the domain shift. Furthermore, FeatDANet is constructed as a self-training network with three weight-sharing branches, and an improved pseudo-label learning strategy is proposed that identifies more confident pseudo-labels and maximizes their use. This increases the participation of unlabeled data and also ensures stability during training. Extensive experiments show that FeatDANet achieves state-of-the-art performance on the GTA-to-Cityscapes and Synthia-to-Cityscapes tasks.
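The query/key mixing between domains can be illustrated with plain scaled dot-product attention, where queries come from one domain and keys/values from the other, so each target-domain feature is re-expressed as a mixture of source-domain features (a minimal numpy sketch; the actual IFAM/DFLM modules are more elaborate):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerically stable
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_domain_attention(q_feats, kv_feats):
    """Scaled dot-product attention with queries from one domain and
    keys/values from the other: each query row becomes a convex
    combination of the other domain's feature rows."""
    d = q_feats.shape[-1]
    attn = softmax(q_feats @ kv_feats.T / np.sqrt(d))
    return attn @ kv_feats

rng = np.random.default_rng(0)
target = rng.normal(size=(4, 16))   # target-domain patch features
source = rng.normal(size=(6, 16))   # source-domain patch features
aligned = cross_domain_attention(target, source)
print(aligned.shape)  # (4, 16)
```

Because the attention weights are non-negative and row-normalized, every aligned feature stays inside the per-dimension range of the source features.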
|
| |
| 14:24-14:30, Paper MoBT19.5 | Add to My Program |
| BlinkFlow: A Dataset to Push the Limits of Event-Based Optical Flow Estimation |
|
| Li, Yijin | Zhejiang University |
| Huang, Zhaoyang | The Chinese University of Hong Kong |
| Chen, Shuo | Zhejiang University |
| Shi, Xiaoyu | The Chinese University of Hong Kong |
| Li, Hongsheng | Chinese University of Hong Kong |
| Bao, Hujun | Zhejiang University |
| Cui, Zhaopeng | Zhejiang University |
| Zhang, Guofeng | Zhejiang University |
Keywords: Deep Learning for Visual Perception, Deep Learning Methods
Abstract: Event cameras provide high temporal precision, low data rates, and high dynamic range visual perception, which are well-suited for optical flow estimation. While data-driven optical flow estimation has obtained great success in RGB cameras, its generalization performance is seriously hindered in event cameras mainly due to the limited and biased training data. In this paper, we present a novel simulator, BlinkSim, for the fast generation of large-scale data for event-based optical flow. BlinkSim incorporates a configurable rendering engine alongside an event simulation suite. By leveraging the wealth of current 3D assets, the rendering engine enables us to automatically build up thousands of scenes with different objects, textures, and motion patterns and render very high-frequency images for realistic event data simulation. Based on BlinkSim, we construct a large training dataset and evaluation benchmark BlinkFlow that contains sufficient, diversiform, and challenging event data with optical flow ground truth. Experiments show that BlinkFlow improves the generalization performance of state-of-the-art methods by more than 40% on average and up to 90%. Moreover, we further propose an Event-based optical Flow transFormer (E-FlowFormer) architecture. Powered by our BlinkFlow, E-FlowFormer outperforms the SOTA methods by up to 91% on the MVSEC dataset and 14% on the DSEC dataset and presents the best generalization performance. The source code and data are available at https://zju3dv.github.io/blinkflow/.
|
| |
| 14:30-14:36, Paper MoBT19.6 | Add to My Program |
| Discovering Symbolic Adaptation Algorithms from Scratch |
|
| Kelly, Stephen | McMaster University |
| Park, Daniel | Google |
| Song, Xingyou | Google Brain |
| McIntire, Mitchell | Google |
| Nashikkar, Pranav | Google |
| Guha, Ritam | Michigan State University |
| Banzhaf, Wolfgang | Michigan State University |
| Deb, Kalyanmoy | Michigan State |
| Boddeti, Vishnu | Michigan State University |
| Tan, Jie | Google |
| Real, Esteban | Google |
Keywords: Evolutionary Robotics, Optimization and Optimal Control, Deep Learning Methods
Abstract: Autonomous robots deployed in the real world will need control policies that rapidly adapt to environmental changes. To this end, we propose AutoRobotics-Zero (ARZ), a method based on AutoML-Zero that discovers zero-shot adaptable policies from scratch. In contrast to neural network adaptation policies, where only model parameters are optimized, ARZ can build control algorithms with the full expressive power of a linear register machine. We evolve modular policies that tune their model parameters and alter their inference algorithm on-the-fly to adapt to sudden environmental changes. We demonstrate our method on a realistic simulated quadruped robot, for which we evolve safe control policies that avoid falling when individual limbs suddenly break. This is a challenging task in which two popular neural network baselines fail. To do so, we leverage multi-objective search to simultaneously optimize forward motion gaits and stability. Finally, we conduct a detailed analysis of our method on a novel and challenging non-stationary control task dubbed Cataclysmic Cartpole. Results confirm our findings that ARZ is significantly more robust to sudden environmental changes and can build simple, interpretable control policies.
|
| |
| 14:36-14:42, Paper MoBT19.7 | Add to My Program |
| Visual Pre-Training for Navigation: What Can We Learn from Noise? |
|
| Wang, Yanwei | MIT |
| Ko, Ching-Yun | MIT |
| Agrawal, Pulkit | MIT |
Keywords: Representation Learning, Vision-Based Navigation, Deep Learning for Visual Perception
Abstract: One powerful paradigm in visual navigation is to predict actions from observations directly. Training such an end-to-end system allows representations useful for downstream tasks to emerge automatically. However, the lack of inductive bias makes this system data-inefficient. We hypothesize that a sufficient representation of the current view and the goal view for a navigation policy can be learned by predicting the location and size of a crop of the current view that corresponds to the goal. We further show that training such random crop prediction in a self-supervised fashion purely on synthetic noise images transfers well to natural home images. The learned representation can then be bootstrapped to learn a navigation policy efficiently with little interaction data.
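The random-crop pretext task described above is easy to sketch: generate a noise image, take a crop of it as the surrogate "goal view", and use the crop's normalized location and size as the regression target (a minimal data-generation sketch; `make_crop_sample` and the 64-pixel resolution are our own assumptions):

```python
import numpy as np

def make_crop_sample(rng, hw=64):
    """Generate one self-supervised training example: a synthetic noise
    'current view', a crop of it standing in for the 'goal view', and
    the regression target (normalized x, y, size) of that crop."""
    img = rng.random((hw, hw))                  # pure-noise image
    size = int(rng.integers(hw // 4, hw // 2))  # crop side length
    x = int(rng.integers(0, hw - size))
    y = int(rng.integers(0, hw - size))
    crop = img[y:y + size, x:x + size]
    target = np.array([x, y, size], dtype=float) / hw
    return img, crop, target

rng = np.random.default_rng(1)
img, crop, target = make_crop_sample(rng)
print(crop.shape, target)
```

A network trained to regress `target` from `(img, crop)` pairs never sees a real photograph, which is the point of the noise-only pre-training.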
|
| |
| 14:42-14:48, Paper MoBT19.8 | Add to My Program |
| Spatio-Temporal Attention Network for Persistent Monitoring of Multiple Mobile Targets |
|
| Wang, Yizhuo | National University of Singapore |
| Wang, Yutong | National University of Singapore |
| Cao, Yuhong | National University of Singapore |
| Sartoretti, Guillaume Adrien | National University of Singapore (NUS) |
Keywords: Deep Learning Methods, Motion and Path Planning, Surveillance Robotic Systems
Abstract: This work focuses on the persistent monitoring problem, where a set of targets moving based on an unknown model must be monitored by an autonomous mobile robot with a limited sensing range. To keep each target's position estimate as accurate as possible, the robot needs to adaptively plan its path to (re-)visit all the targets and update its belief from measurements collected along the way. In doing so, the main challenge is to strike a balance between exploitation, i.e., re-visiting previously-located targets, and exploration, i.e., finding new targets or re-acquiring lost ones. Encouraged by recent advances in deep reinforcement learning, we introduce an attention-based neural solution to the persistent monitoring problem, where the agent can learn the inter-dependencies between targets, i.e., their spatial and temporal correlations, conditioned on past measurements. This endows the agent with the ability to determine which target, time, and location to attend to across multiple scales, which we show also helps relax the usual limitations of a finite target set with prior positional information. We experimentally demonstrate that our method outperforms other baselines in terms of the number of target visits and average estimation error in complex environments. Finally, we implement and validate our model in a drone-based simulation experiment to monitor mobile ground targets in a high-fidelity simulator.
|
| |
| 14:48-14:54, Paper MoBT19.9 | Add to My Program |
| Subtask Aware End-To-End Learning for Visual Room Rearrangement |
|
| Kim, Youngho | KAIST (Korea Advanced Institute of Science and Technology) |
| Kim, Jong-Hwan | KAIST |
Keywords: Deep Learning Methods, Perception-Action Coupling, Long term Interaction
Abstract: The goal of intelligent embodied agents is to learn how to explore within the environment, interact with objects, and understand the environment in order to achieve task objectives. There are two main approaches to training such agents: one is to train an action policy that performs the task goal through end-to-end learning, and the other is to construct a policy by implementing the necessary abilities according to the task goal in a modular manner. For complex and long-horizon tasks, such as visual room rearrangement, a modular approach that infers the task sequence by identifying the causality of actions through prior knowledge shows higher performance. Based on this insight, we propose an Online Subtask Prediction Network (OSPNet) that determines the subtask to be performed at each moment based on the environment information and the past subtask inference history, training an embodied agent for long-horizon tasks in an end-to-end manner. We also propose a Subtask Aware Policy Network (SAPNet) as the action policy that decides actions based on the reasoning of the OSPNet. We implement an embodied agent that performs visual room rearrangement using the proposed SAPNet and train it through imitation learning, demonstrating similar or better performance with far fewer training steps than previous works.
|
| |
| 14:54-15:00, Paper MoBT19.10 | Add to My Program |
| Disentangling Crowds Interactions for Pedestrians Trajectory Prediction |
|
| Bhujel, Niraj | A*STAR |
| Yau, Wei-Yun | I2R |
Keywords: Deep Learning Methods, Human-Aware Motion Planning, Probabilistic Inference
Abstract: Predicting the future actions of multiple pedestrians is an essential capability for autonomous robots operating in crowded human environments. Estimating the unknown future path is a challenging problem due to the complex interactions occurring among pedestrians. Although recent developments in Graph Convolutional Networks (GCNs) allow for efficient encoding of such complex interactions, the encoded representations still lack the informative factors necessary to accurately predict future behavior. To solve this, we introduce the Disentangled GCN (DGCN), which aims to better capture crowd interactions by decoupling the spatial and temporal factors. More specifically, we propose to encode the crowd interactions with two low-dimensional latent spaces, a spatial latent and a temporal latent, and decode the pedestrian's future behavior using the learned latents. We propose a novel regularizer function to train these latents in an unsupervised manner and condition the trajectory prediction on the learned latents using a spatially aware graph decoder. The proposed method is evaluated extensively on publicly available datasets consisting of pedestrians and vehicles. Our method improves mADE on the ETH/UCY pedestrian datasets and achieves a new state-of-the-art mFDE on the nuScenes vehicle dataset.
|
| |
| 15:00-15:06, Paper MoBT19.11 | Add to My Program |
| EAAINet: An Element-Wise Attention Network with Global Affinity Information for Accurate Indoor Visual Localization |
|
| Dai, Kun | HIT |
| Xie, Tao | Harbin Institute of Technology |
| Wang, Ke | Harbin Institute of Technology |
| Jiang, Zhiqiang | Harbin Institute of Technology |
| Liu, Dedong | Harbin Institute of Technology |
| Li, Ruifeng | Harbin Institute of Technology |
| Wang, Jiahe | Harbin Institute of Technology |
Keywords: Deep Learning Methods, Transfer Learning, Deep Learning for Visual Perception
Abstract: Visual localization, a vital component of many visual applications, has been tackled by scene coordinates regression (SCoRe) methods that leverage neural networks to predict scene coordinates, followed by a PnP algorithm to recover camera pose. Nevertheless, these methods do not consider the relationship between image patches, known as relative features or affinity information, which can improve the capability of the network to perform complete scene parsing. Additionally, owing to the visual similarity between image patches, these methods are incapable of extracting reliable absolute features, resulting in inferior performance. In response, we propose EAAINet that is based on classical SCoRe-based approaches and consists of two novel modules: the Global Affinity Aggregation Module (GAAM) and the Element-wise Attention Module (EAM). Specifically, GAAM employs an interval sampling strategy to sample image patches to construct sparse graph neural networks (GNN), from which global affinity information between image patches is retrieved, hence ensuring precise scene parsing. EAM integrates multi-level features to generate reliable absolute features to regress accurate scene coordinates, with the key insight that the structure information is essential to differentiate similar image patches and the semantic information assists in modeling regression problems. Technically, EAM predicts element-wise soft attention masks to reconcile multi-level feature maps, enabling efficient feature fusion. Positional encoding and uncertainty modeling are also employed to enhance visual localization performance. Our proposed GAAM and EAM are designed as generic modules that can be assembled into modern SCoRe-based networks to boost performance. Experimental results show
|
| |
| 15:06-15:12, Paper MoBT19.12 | Add to My Program |
| Transformer-Based Neural Augmentation of Robot Simulation Representations |
|
| Serifi, Agon | ETH Zurich |
| Knoop, Espen | The Walt Disney Company |
| Schumacher, Christian | Disney Research |
| Kumar, Naveen | The Walt Disney Company |
| Gross, Markus | ETH Zurich |
| Bücher, Moritz | Disney Research |
Keywords: Deep Learning Methods, Simulation and Animation, Machine Learning for Robot Control
Abstract: Simulation representations of robots have advanced in recent years. Yet, there remain significant sim-to-real gaps because of modeling assumptions and hard-to-model behaviors such as friction. In this letter, we propose to augment common simulation representations with a transformer-inspired architecture, by training a network to predict the true state of robot building blocks given their simulation state. Because we augment building blocks, rather than the full simulation state, we make our approach modular which improves generalizability and robustness. We use our neural network to augment the state of robot actuators, and also of rigid body states. Our actuator augmentation generalizes well across robots, and our rigid body augmentation results in improvements even under high uncertainty in model parameters.
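The per-building-block residual correction can be sketched as follows (a toy linear map stands in for the trained transformer; `W`, `augment_block`, and `augment_robot` are illustrative assumptions, not the authors' architecture):

```python
import numpy as np

# Hypothetical learned residual: a fixed linear correction standing in
# for the trained network that maps simulated block states to residuals.
W = 0.05 * np.eye(3)

def augment_block(sim_state):
    """Predict the 'true' state of one building block from its simulated
    state by adding a learned residual: true ~ sim + f(sim)."""
    return sim_state + sim_state @ W

def augment_robot(block_states):
    """Apply the per-block augmentation independently to every building
    block; this modularity is what lets the correction transfer across
    robots assembled from the same blocks."""
    return [augment_block(s) for s in block_states]

sim = [np.array([1.0, 0.0, 0.5]), np.array([0.2, -0.3, 0.0])]
print(augment_robot(sim))
```

Because each block is corrected in isolation, adding or removing blocks changes only the length of the input list, not the model.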
|
| |
| MoBIP Interactive session, Hall E |
|
| Poster M2 |
|
| |
| |
| Subsession MoBIP-01, Hall E | |
| Clone of 'Task Planning' Regular session, 14 papers |
| |
| Subsession MoBIP-02, Hall E | |
| Clone of 'Prosthesis Design and Control' Regular session, 12 papers |
| |
| Subsession MoBIP-03, Hall E | |
| Clone of 'Collision Avoidance II' Regular session, 13 papers |
| |
| Subsession MoBIP-04, Hall E | |
| Clone of 'Motion Control' Regular session, 13 papers |
| |
| Subsession MoBIP-05, Hall E | |
| Clone of 'Mechanism Design II' Regular session, 14 papers |
| |
| Subsession MoBIP-06, Hall E | |
| Clone of 'Modeling, Control, and Learning for Soft Robots II' Regular session, 13 papers |
| |
| Subsession MoBIP-07, Hall E | |
| Clone of 'Micro and Nano Robotics' Regular session, 14 papers |
| |
| Subsession MoBIP-08, Hall E | |
| Clone of 'Legged Robots II' Regular session, 12 papers |
| |
| Subsession MoBIP-09, Hall E | |
| Clone of 'Motion and Path Planning II' Regular session, 13 papers |
| |
| Subsession MoBIP-10, Hall E | |
| Clone of 'Learning for Manipulation II' Regular session, 12 papers |
| |
| Subsession MoBIP-11, Hall E | |
| Clone of 'Aerial Systems - Applications II' Regular session, 14 papers |
| |
| Subsession MoBIP-12, Hall E | |
| Clone of 'Perception for Grasping and Manipulation II' Regular session, 12 papers |
| |
| Subsession MoBIP-13, Hall E | |
| Clone of 'Computer Vision for Automation' Regular session, 14 papers |
| |
| Subsession MoBIP-14, Hall E | |
| Clone of 'Localization II' Regular session, 13 papers |
| |
| Subsession MoBIP-15, Hall E | |
| Clone of 'Visual SLAM' Regular session, 13 papers |
| |
| Subsession MoBIP-16, Hall E | |
| Clone of 'AI-Enabled Robotics' Regular session, 14 papers |
| |
| Subsession MoBIP-17, Hall E | |
| Clone of 'Learning from Demonstration' Regular session, 14 papers |
| |
| Subsession MoBIP-18, Hall E | |
| Clone of 'Human Detection and Pose' Regular session, 13 papers |
| |
| Subsession MoBIP-19, Hall E | |
| Clone of 'Deep Learning Methods II' Regular session, 12 papers |
| |
| 15:30-17:00, Subsession MoBIP-20, Hall E | |
| Late Breaking Posters II Late breaking, 32 papers |
| |
| MoBIP-01 Regular session, Hall E |
Add to My Program |
| Clone of 'Task Planning' |
|
| |
| |
| 15:30-17:00, Paper MoBIP-01.1 | Add to My Program |
| Learning a Causal Transition Model for Object Cutting |
|
| Zhang, Zeyu | Beijing Institute for General Artificial Intelligence |
| Han, Muzhi | University of California, Los Angeles |
| Jia, Baoxiong | Beijing Institute for General Artificial Intelligence |
| Jiao, Ziyuan | Beijing Institute for General Artificial Intelligence |
| Zhu, Yixin | Peking University |
| Zhu, Song-Chun | UCLA |
| Liu, Hangxin | Beijing Institute for General Artificial Intelligence (BIGAI) |
Keywords: Task Planning, Simulation and Animation
Abstract: Cutting objects into desired fragments is challenging for robots due to the spatially unstructured nature of fragments and the complex one-to-many object fragmentation caused by actions. We present a novel approach to model object fragmentation using an attributed stochastic grammar. This grammar abstracts fragment states as node variables and captures causal transitions in object fragmentation through production rules. We devise a probabilistic framework to learn this grammar from human demonstrations. The planning process for object cutting involves inferring an optimal parse tree of desired fragments using the learned grammar, with parse tree productions corresponding to cutting actions. We employ Monte Carlo Tree Search (MCTS) to efficiently approximate the optimal parse tree and generate a sequence of executable cutting actions. The experiments demonstrate the efficacy of our approach in planning for object-cutting tasks, both in simulation and on a physical robot. The proposed approach outperforms several baselines by demonstrating superior generalization to novel setups, thanks to the compositionality of the grammar model.
|
| |
| 15:30-17:00, Paper MoBIP-01.2 | Add to My Program |
| Object Rearrangement Planning for Target Retrieval in a Confined Space with Lateral View |
|
| Kang, Minjae | Seoul National University (SNU) |
| Kim, Junseok | Seoul National University |
| Kee, Hogun | Seoul National University |
| Oh, Songhwai | Seoul National University |
Keywords: Task and Motion Planning, Manipulation Planning, Deep Learning in Grasping and Manipulation
Abstract: In this paper, we perform an object rearrangement task for target retrieval in an environment with a confined space and limited observation directions. The agent must create a collision-free path to bring out the target object by relocating the surrounding objects using the prehensile action, i.e., pick-and-place. Object rearrangement in a confined space is a non-monotone problem, and finding a valid plan within a reasonable time is challenging. We propose a novel algorithm that divides the target retrieval task, which requires a long sequence of actions, into sequential sub-problems and explores each solution through subgoal-conditioned Monte Carlo tree search (MCTS). In the experiment, we verify that the proposed algorithm can find safe rearrangement plans with various objects efficiently compared to the existing planning methods. Furthermore, we show that the proposed method can be transferred to a real robot experiment without additional training.
|
| |
| 15:30-17:00, Paper MoBIP-01.3 | Add to My Program |
| Learning Type-Generalized Actions for Symbolic Planning |
|
| Tanneberg, Daniel | Honda Research Institute |
| Gienger, Michael | Honda Research Institute Europe |
Keywords: Representation Learning, Task Planning
Abstract: Symbolic planning is a powerful technique to solve complex tasks that require long sequences of actions and can equip an intelligent agent with complex behavior. The downside of this approach is the necessity for suitable symbolic representations describing the state of the environment as well as the actions that can change it. Traditionally such representations are carefully hand-designed by experts for distinct problem domains, which limits their transferability to different problems and environment complexities. In this paper, we propose a novel concept to generalize symbolic actions using a given entity hierarchy and observed similar behavior. In a simulated grid-based kitchen environment, we show that type-generalized actions can be learned from a few observations and generalize to novel situations. By incorporating an additional on-the-fly generalization mechanism during planning, unseen task combinations involving longer sequences, novel entities, and unexpected environment behavior can be solved.
|
| |
| 15:30-17:00, Paper MoBIP-01.4 | Add to My Program |
| CAR-DESPOT: Causally-Informed Online POMDP Planning for Robots in Confounded Environments |
|
| Cannizzaro, Ricardo | Oxford Robotics Institute |
| Kunze, Lars | University of Oxford |
Keywords: Task Planning, Probabilistic Inference
Abstract: Robots operating in real-world environments must reason about possible outcomes of stochastic actions and make decisions based on partial observations of the true world state. A major challenge for making accurate and robust action predictions is the problem of confounding, which if left untreated can lead to prediction errors. The partially observable Markov decision process (POMDP) is a widely-used framework to model these stochastic and partially-observable decision-making problems. However, due to a lack of explicit causal semantics, POMDP planning methods are prone to confounding bias and thus in the presence of unobserved confounders may produce underperforming policies. This paper presents a novel causally-informed extension of "anytime regularized determinized sparse partially observable tree" (AR-DESPOT), a modern anytime online POMDP planner, using causal modelling and inference to eliminate errors caused by unmeasured confounder variables. We further propose a method to learn offline the partial parameterisation of the causal model for planning, from ground truth model data. We evaluate our methods on a toy problem with an unobserved confounder and show that the learned causal model is highly accurate, while our planning method is more robust to confounding and produces overall higher performing policies than AR-DESPOT.
|
| |
| 15:30-17:00, Paper MoBIP-01.5 | Add to My Program |
| Recurrent Macro Actions Generator for POMDP Planning |
|
| Liang, Yuanchu | The Australian National University |
| Kurniawati, Hanna | Australian National University |
Keywords: Task and Motion Planning, AI-Based Methods, Probability and Statistical Methods
Abstract: Many planning problems in robotics require a long planning horizon and are uncertain in nature. The Partially Observable Markov Decision Process (POMDP) is a mathematically principled framework for planning under uncertainty. To alleviate the difficulty of computing good approximate POMDP solutions for long-horizon problems, one often plans using macro actions, where each macro action is a chain of primitive actions. Such a strategy reduces the effective planning horizon of the problem, and hence the computational complexity of solving it. The difficulty lies in generating a set of suitable macro actions. In this paper, we present a simple recurrent neural network that learns to generate suitable sets of candidate macro actions that exploit environment information. Key to this learning method is representing the raw partial information from the environment as a latent problem instance, and sequentially generating macro actions conditioned on past information. We compare our proposed method with the state-of-the-art [1] on four different long-horizon planning tasks of varying difficulty. The results indicate that the quality of the policies computed using macro actions generated by our proposed method consistently exceeds the benchmarks. Our implementation can be accessed at https://github.com/YC-Liang/Recurrent-Macro-Action-Generator.
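The way macro actions shrink the effective horizon is easy to see in code: a short plan over macros expands into a much longer chain of primitives, so the planner searches over far fewer decision points (a toy sketch; the macro and primitive names are invented for illustration):

```python
def expand_macros(plan, macros):
    """Expand a plan over macro actions into primitive actions; each
    macro is a fixed chain of primitives, so a 2-step macro plan can
    stand in for a 5-step primitive plan."""
    return [p for m in plan for p in macros[m]]

# Hypothetical macro library for a pick-up task.
macros = {"approach": ["fwd", "fwd", "left"], "grasp": ["down", "close"]}
print(expand_macros(["approach", "grasp"], macros))
# ['fwd', 'fwd', 'left', 'down', 'close']
```

The learning problem the paper addresses is choosing which chains to put in `macros` in the first place, conditioned on the observed environment.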
|
| |
| 15:30-17:00, Paper MoBIP-01.6 | Add to My Program |
| Task Planning and Motion Control with Temporal Logic Specifications |
|
| Pereira, Marcos S. | Universidade Federal De Minas Gerais |
| Pimenta, Luciano | Universidade Federal De Minas Gerais |
| Adorno, Bruno Vilhena | The University of Manchester |
Keywords: Task and Motion Planning, Formal Methods in Robotics and Automation, Motion Control
Abstract: This paper proposes a task planning and motion control framework that generates task plans for a linear temporal logic (LTL) specification, which are then executed using a task-space constrained motion controller and a local task planner that overcomes local minima. We propose a new encoding for task specifications, directly in the task-space, as constraints of a mixed-integer linear program that can be used with off-the-shelf LTL linear encoding. We apply our framework to plan and execute trajectories for a free-flying robot and show that the task plan is accomplished without collisions, even in the presence of unexpected moving obstacles that are not considered in the planning phase, while control signal constraints are satisfied. To evaluate the local minima avoidance, we compare the local task planner with a sampling-based motion planner, and the results show a smoother trajectory with faster execution and less total planning time when using our framework. Lastly, our framework scaled well with a longer LTL specification, as opposed to automata-based frameworks, which usually suffer from the curse of dimensionality.
|
| |
| 15:30-17:00, Paper MoBIP-01.7 | Add to My Program |
| Simultaneous Action and Grasp Feasibility Prediction for Task and Motion Planning through Multi-Task Learning |
|
| Ait Bouhsain, Smail | LAAS-CNRS |
| Alami, Rachid | CNRS |
| Simeon, Thierry | LAAS-CNRS |
Keywords: Task and Motion Planning, Deep Learning in Grasping and Manipulation, Manipulation Planning
Abstract: In this paper, we address task and motion planning (TAMP) which is an important yet challenging robotics problem. It is known to suffer from the high combinatorial complexity of discrete search, often requiring a large number of geometric planning calls. We build upon recent works in TAMP by taking advantage of learning methods to provide action feasibility information as a heuristic to the symbolic planner, thus guiding it to a geometrically feasible solution and reducing geometric planning time. We propose AGFP-Net, a multi-task neural network predicting not only action feasibility, but also the feasibility of a set of grasp types. We also propose an improved feasibility-informed TAMP algorithm capable of solving more complex problems, and handling goals which are not fully specified. Comparative results obtained on different problems of varying complexity show that our method is able to greatly reduce task and motion planning time.
|
| |
| 15:30-17:00, Paper MoBIP-01.8 | Add to My Program |
| Differentiable Task Assignment and Motion Planning |
|
| Envall, Jimmy | ETH Zurich |
| Poranne, Roi | University of Haifa |
| Coros, Stelian | ETH Zurich |
Keywords: Task and Motion Planning, Cooperating Robots, Manipulation Planning
Abstract: Task and motion planning is one of the key problems in robotics today. It is often formulated as a discrete task allocation problem combined with continuous motion planning. Many existing approaches to TAMP involve explicit descriptions of task primitives that cause discrete changes in the kinematic relationship between the actor and the objects. In this work we propose an alternative, fully differentiable approach which supports a large number of TAMP problem instances. Rather than explicitly enumerating task primitives, actions are instead represented implicitly as part of the solution to a nonlinear optimization problem. We focus on decision making for robotic manipulators, specifically for pick and place tasks, and explore the efficacy of the model through a number of simulated experiments including multiple robots, objects and interactions with the environment. We also show several possible extensions.
|
| |
| 15:30-17:00, Paper MoBIP-01.9 | Add to My Program |
| Effectively Rearranging Heterogeneous Objects on Cluttered Tabletops |
|
| Gao, Kai | Rutgers University |
| Yu, Justin | Rutgers University |
| Punjabi, Tanay Sandeep | Rutgers |
| Yu, Jingjin | Rutgers University |
Keywords: Task Planning, Manipulation Planning, Logistics
Abstract: Effectively rearranging heterogeneous objects constitutes a high-utility skill that an intelligent robot should master. Whereas significant work has been devoted to the grasp synthesis of heterogeneous objects, little attention has been given to the planning for sequentially manipulating such objects. In this work, we examine the long-horizon sequential rearrangement of heterogeneous objects in a tabletop setting, addressing not just generating feasible plans but near-optimal ones. Toward that end, and building on previous methods, including combinatorial algorithms and Monte Carlo tree search-based solutions, we develop state-of-the-art solvers for optimizing two practical objective functions considering key object properties such as size and weight. Thorough simulation studies show that our methods provide significant advantages in handling challenging heterogeneous object rearrangement problems, especially in cluttered settings. Real robot experiments further demonstrate and confirm these advantages. Source code and evaluation data associated with this research will be available at https://github.com/arc-l/TRLB upon the publication of this manuscript.
|
| |
| 15:30-17:00, Paper MoBIP-01.10 | Add to My Program |
| Semantics-Aware Mission Adaptation for Autonomous Exploration in Urban Environments |
|
| Moon, Sangwoo | Jet Propulsion Laboratory, NASA |
| Peltzer, Oriana | Stanford University |
| Ott, Joshua | Stanford University |
| Kim, Sung-Kyun | NASA Jet Propulsion Laboratory, Caltech |
| Agha-mohammadi, Ali-akbar | NASA-JPL, Caltech |
Keywords: Task Planning, Task and Motion Planning, Planning, Scheduling and Coordination
Abstract: Robust mission planning is an essential component for mission autonomy to perform complicated tasks in extreme environments. In this paper, we are interested in the role of semantic abstractions for guiding autonomous mission planning. In particular, we focus on how semantics can be leveraged to transition, at the mission level, in between individually robust task plans. We present a mission autonomy framework wherein a task plan adaptation policy leverages up-to-date semantics information in order to adapt to changes that occur during run-time, which endows the robot with better resiliency to unexpected events and improves the overall efficiency of mission operations. Under this new perspective, we provide a concrete and challenging application of autonomous exploration and radio source seeking in a complex multi-level building environment. Experimental results over simulations and real hardware tests demonstrate that the presented semantics-aware mission adaptation more effectively completes the mission with better qualitative results compared to a non-adaptive baseline.
|
| |
| 15:30-17:00, Paper MoBIP-01.11 | Add to My Program |
| Optimal Cost-Preference Trade-Off Planning with Multiple Temporal Tasks |
|
| Amorese, Peter | University of Colorado Boulder |
| Lahijanian, Morteza | University of Colorado Boulder |
Keywords: Task Planning, Task and Motion Planning, Motion and Path Planning
Abstract: Autonomous robots are increasingly utilized in realistic scenarios with multiple complex tasks. In these scenarios, there may be a preferred way of completing all of the given tasks, but it is often in conflict with optimal execution. Recent work studies preference-based planning; however, it has yet to extend the notion of preference to the behavior of the robot with respect to each task. In this work, we introduce a novel notion of preference that provides a generalized framework to express preferences over individual tasks as well as their relations. Then, we perform an optimal trade-off (Pareto) analysis between behaviors that adhere to the user's preference and the ones that are resource optimal. We introduce an efficient planning framework that generates Pareto-optimal plans given a user's preference by extending A* search. Further, we show a method of computing the entire Pareto front (the set of all optimal trade-offs) via an adaptation of a multi-objective A* algorithm. We also present a problem-agnostic search heuristic to enable scalability. We illustrate the power of the framework on both mobile robots and manipulators. Our benchmarks show the effectiveness of the heuristic with up to two orders of magnitude speedup.
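The trade-off analysis described above can be illustrated with a minimal Pareto-front filter over candidate plans scored on two objectives; the hypothetical (cost, preference-violation) pairs and helper names below are a sketch of the general idea, not the paper's multi-objective A* algorithm.

```python
# Illustrative Pareto-front extraction: a plan is kept only if no other plan
# is at least as good on both objectives (minimized) and different from it.
from typing import List, Tuple

Plan = Tuple[float, float]  # (resource cost, preference violation)

def dominates(a: Plan, b: Plan) -> bool:
    """True if plan `a` Pareto-dominates plan `b` under minimization."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b

def pareto_front(plans: List[Plan]) -> List[Plan]:
    """Keep every plan that is not dominated by another candidate."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]

plans = [(10, 0.9), (12, 0.4), (15, 0.1), (14, 0.5)]
print(pareto_front(plans))  # (14, 0.5) is dominated by (12, 0.4)
```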
|
| |
| 15:30-17:00, Paper MoBIP-01.12 | Add to My Program |
| Optimal and Stable Multi-Layer Object Rearrangement on a Tabletop |
|
| Xu, Andy | Rutgers University |
| Gao, Kai | Rutgers University |
| Feng, Si Wei | Rutgers University |
| Yu, Jingjin | Rutgers University |
Keywords: Task Planning, Assembly, Manipulation Planning
Abstract: Object rearrangement is a fundamental sub-task in accomplishing a great many physical tasks. As such, effectively executing rearrangement is an important skill for intelligent robots to master. In this study, we conduct the first algorithmic study on optimally solving the problem of Multi-layer Object Rearrangement on a Tabletop (MORT), in which one object may be relocated at a time, and an object can only be moved if other objects do not block its top surface. In addition, any intermediate structure during the reconfiguration process must be physically stable, i.e., it should stand without external support. To tackle the dual challenges of untangling the dependencies between objects and ensuring structural stability, we develop an algorithm that interleaves the computation of the optimal rearrangement plan and structural stability checking. Using a carefully constructed integer linear programming (ILP) model, our algorithm, Stability-aware Integer Programming-based Planner (SIPP), readily scales to optimally solve complex rearrangement problems of 3D structures with over 60 building blocks, with solution quality significantly outperforming natural greedy best-first approaches. Upon the publication of the manuscript, source code and data will be available at https://github.com/arc-l/mort/
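The blocking constraint stated above, that an object can only be moved if other objects do not block its top surface, can be sketched as a simple clear-top check; the scene encoding (a map from block to the set of blocks directly on top of it) is an illustrative assumption, not the paper's ILP model.

```python
# Illustrative movability check for multi-layer rearrangement: a block may
# be relocated only if its top surface is clear.
def movable(block, on_top_of):
    """True if no block rests directly on `block`."""
    return not on_top_of.get(block, set())

def movable_blocks(on_top_of, blocks):
    """All blocks whose top surfaces are currently clear."""
    return {b for b in blocks if movable(b, on_top_of)}

# A is under B; B and C have clear tops.
on_top_of = {"A": {"B"}, "B": set(), "C": set()}
print(sorted(movable_blocks(on_top_of, {"A", "B", "C"})))  # ['B', 'C']
```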
|
| |
| 15:30-17:00, Paper MoBIP-01.13 | Add to My Program |
| Task and Motion Planning with Large Language Models for Object Rearrangement |
|
| Ding, Yan | SUNY Binghamton |
| Zhang, Xiaohan | SUNY Binghamton |
| Paxton, Chris | Meta AI |
| Zhang, Shiqi | SUNY Binghamton |
Keywords: Task and Motion Planning, Service Robotics
Abstract: Multi-object rearrangement is a crucial skill for service robots, and commonsense reasoning is frequently needed in this process. However, achieving commonsense arrangements requires knowledge about objects, which is hard to transfer to robots. Large language models (LLMs) are one potential source of this knowledge, but they do not naively capture information about plausible physical arrangements of the world. We propose LLM-GROP, which uses prompting to extract commonsense knowledge about functional, semantically valid object configurations from an LLM, and instantiates them with a task and motion planner in order to generalize to varying scene geometry. LLM-GROP allows us to go from natural-language commands to human-aligned object rearrangement in varied environments. Based on human evaluations, our approach achieved the highest rating and outperformed competitive baselines in terms of success rate while maintaining comparable cumulative action costs. Finally, we demonstrate a practical implementation of LLM-GROP on a mobile manipulator in real-world scenarios.
|
| |
| 15:30-17:00, Paper MoBIP-01.14 | Add to My Program |
| Synergistic Task and Motion Planning with Reinforcement Learning-Based Non-Prehensile Actions |
|
| Liu, Gaoyuan | Vrije Universiteit Brussel |
| De Winter, Joris | Vrije Universiteit Brussel |
| Steckelmacher, Denis | Vrije Universiteit Brussel |
| Hota, Roshan Kumar | Indian Institute of Technology, Kharagpur, India |
| Nowé, Ann | VUB |
| Vanderborght, Bram | Vrije Universiteit Brussel |
Keywords: Task and Motion Planning, Reinforcement Learning, Manipulation Planning
Abstract: Robotic manipulation in cluttered environments requires synergistic planning among prehensile and non-prehensile actions. Previous works on sampling-based Task and Motion Planning (TAMP) algorithms, e.g., PDDLStream, provide a fast and generalizable solution for multi-modal manipulation. However, they are likely to fail in cluttered scenarios where no collision-free grasping approaches can be sampled without preliminary manipulations. To extend the ability of sampling-based algorithms, we integrate a vision-based Reinforcement Learning (RL) non-prehensile procedure, pusher. The pushing actions generated by pusher can eliminate interlocked situations and make the grasping problem solvable. Also, the sampling-based algorithm evaluates the pushing actions by providing rewards in the training process, so the pusher can learn to avoid situations leading to irreversible failures. The proposed hybrid planning method is validated on a cluttered bin-picking problem and implemented in both simulation and the real world. Results show that the pusher can effectively improve the success ratio of the previous sampling-based algorithm, while the sampling-based algorithm can help the pusher learn pushing skills.
|
| |
| MoBIP-02 Regular session, Hall E |
Add to My Program |
| Clone of 'Prosthesis Design and Control' |
|
| |
| |
| 15:30-17:00, Paper MoBIP-02.1 | Add to My Program |
| Improving Amputee Endurance Over Activities of Daily Living with a Robotic Knee-Ankle Prosthesis: A Case Study |
|
| Best, T. Kevin | University of Michigan |
| Laubscher, Curt A. | University of Michigan |
| Cortino, Ross | University of Michigan |
| Cheng, Shihao | University of Michigan, Ann Arbor |
| Gregg, Robert D. | University of Michigan |
Keywords: Prosthetics and Exoskeletons, Rehabilitation Robotics
Abstract: Robotic knee-ankle prostheses have often fallen short relative to passive microprocessor prostheses in time-based clinical outcome tests. User ambulation endurance is an alternative clinical outcome metric that may better highlight the benefits of robotic prostheses. However, previous studies were unable to show endurance benefits due to inaccurate high-level classification, discretized mid-level control, and insufficiently difficult ambulation tasks. In this case study, we present a phase-based mid-level prosthesis controller which yields biomimetic joint kinematics and kinetics that adjust to suit a continuum of tasks. We enrolled an individual with an above-knee amputation and challenged him to perform repeated, rapid laps of a circuit comprising activities of daily living with both his passive prosthesis and a robotic prosthesis. The participant demonstrated improved endurance with the robotic prosthesis and our mid-level controller compared to his passive prosthesis, completing over twice as many total laps before fatigue and muscle discomfort required him to stop. We also show that time-based outcome metrics fail to capture this endurance improvement, suggesting that alternative metrics related to endurance and fatigue may better highlight the clinical benefits of robotic prostheses.
|
| |
| 15:30-17:00, Paper MoBIP-02.2 | Add to My Program |
| Controlling Powered Prosthesis Kinematics Over Continuous Transitions between Walk and Stair Ascent |
|
| Cheng, Shihao | University of Michigan, Ann Arbor |
| Laubscher, Curt A. | University of Michigan |
| Gregg, Robert D. | University of Michigan |
Keywords: Prosthetics and Exoskeletons, Wearable Robotics, Motion Control
Abstract: One of the primary benefits of emerging powered prosthetic legs is their ability to facilitate step-over-step stair ascent by providing positive mechanical work. Existing control methods typically have distinct steady-state activity modes for walking and stair ascent, where activity transitions involve discretely switching between controllers and often must be initiated with a particular leg. However, these discrete transitions do not necessarily replicate able-bodied joint biomechanics, which have been shown to continuously adjust over a transition stride. This paper presents a phase-based kinematic controller for a powered knee-ankle prosthesis that enables continuous, biomimetic transitions between walking and stair ascent. The controller tracks joint angles from a data-driven kinematic model that continuously interpolates between the steady-state kinematic models, and it allows both the prosthetic and intact leg to lead the transitions. Results from experiments with two transfemoral amputee participants indicate that knee and ankle kinematics smoothly transition between walking and stair ascent, with comparable or lower root mean square errors compared to variations from able-bodied data.
|
| |
| 15:30-17:00, Paper MoBIP-02.3 | Add to My Program |
| Calibration of a Tibia-Based Phase Variable for Control of Robotic Transtibial Prostheses |
|
| Posh, Ryan | University of Notre Dame |
| Tittle, Jonathan Allen | University of Notre Dame |
| Schmiedeler, James | University of Notre Dame |
| Wensing, Patrick M. | University of Notre Dame |
Keywords: Prosthetics and Exoskeletons
Abstract: Phase variable control based on global tibia kinematics holds promise for predicting gait cycle progression to continuously control robotic transtibial prostheses. Calibration of the phase variable is critical to ensure its monotonic behavior, to approach a linear relationship with gait percentage, and to accurately predict the percentage of gait. This paper compares four calibration approaches using data from 22 able-bodied subjects walking at 14 speeds [1]. The typical pure centering (PC) approach employed for thigh-based phase variables is not viable, yielding monotonic phase progression in fewer than half of the cases. An optimization (OPT) approach found monotonic calibrations in 305/308 cases with high linearity (average R2 of 0.91). Critical point centering (CPC) approximates the OPT performance, with 274/308 monotonic calibrations and an average R2 of 0.85, whereas the related vertical weighted average (VWA) approach was only slightly better than PC. All four approaches are similarly accurate in predicting gait percentage, staying within 5% at least 92.7% of the time.
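The calibration criteria above, monotonic phase progression and a near-linear relationship with gait percentage scored by R², can be checked with a small sketch; the sample values below are hypothetical, not data from [1].

```python
# Illustrative scoring of a phase-variable calibration: the phase should
# increase monotonically over the stride and track gait percentage linearly
# (high R^2). Sample values are made up for demonstration.
def r_squared(xs, ys):
    """Coefficient of determination for a linear fit of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

def is_monotonic(ys):
    """True if the sequence never decreases."""
    return all(b >= a for a, b in zip(ys, ys[1:]))

gait_pct = [0, 25, 50, 75, 100]
phase = [0.0, 0.22, 0.51, 0.74, 1.0]   # hypothetical calibrated samples
print(is_monotonic(phase), round(r_squared(gait_pct, phase), 3))
```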
|
| |
| 15:30-17:00, Paper MoBIP-02.4 | Add to My Program |
| On Intuitive Control of Ankle-Foot Prostheses: A Sensor Fusion-Based Algorithm for Real-Time Prediction of Transitions to Compliant Surfaces |
|
| Angelidou, Charikleia | University of Delaware |
| Artemiadis, Panagiotis | University of Delaware |
Keywords: Prosthetics and Exoskeletons
Abstract: Substantial research and development on the design and control of robotic ankle-foot prostheses have aimed to restore normal function and movement capacity for people with gait impairments and lower-limb amputations. However, prosthesis controllers usually fail to incorporate information pertaining to the properties of the walking terrain, such as ground stiffness. There is therefore a need for a framework that adjusts the prosthesis parameters according to the user's intent to transition to a variable-impedance terrain. To achieve this, we need to incorporate the human wearer in the control loop of the prosthesis. This work proposes an advanced, high-level controller framework for powered ankle-foot prostheses that combines subject-specific pattern recognition (PR) and classification strategies to predict whether the next step will be on a rigid or compliant surface. Comparing the Support Vector Machine (SVM) and k-Nearest Neighbors (k-NN) classification algorithms for this task, we conclude that by combining a k-NN implementation with a Pattern Recognition Neural Network (PR NN), our method can accurately forecast upcoming surface stiffness transitions in time to allow for prompt adaptation to the new walking terrain. We also show that the sensor fusion of kinematic and surface electromyographic (EMG) data outperforms single-source inputs, producing the best prediction results for all subjects with an accuracy of up to 87.5%.
|
| |
| 15:30-17:00, Paper MoBIP-02.5 | Add to My Program |
| Powered Knee and Ankle Prosthesis Control for Adaptive Ambulation at Variable Speeds, Inclines, and Uneven Terrains |
|
| Sullivan, Liam | University of Utah |
| Creveling, Suzi | University of Utah |
| Cowan, Marissa | University of Utah |
| Gabert, Lukas | University of Utah |
| Lenzi, Tommaso | University of Utah |
Keywords: Prosthetics and Exoskeletons, Rehabilitation Robotics, Wearable Robotics
Abstract: Ambulation in everyday life requires walking at variable speeds, variable inclines, and variable terrains. Powered prostheses aim to provide this adaptability through control of the actuated joints. Some powered prosthesis controllers can adapt to discrete changes in speed and incline but require manual tuning to determine the control parameters, leading to poor clinical viability. Other data-driven controllers can continuously adapt to changes in speed and incline but do so by imposing the same non-amputee gait patterns for all amputee subjects, which does not consider subjective preferences and differing clinical needs of users. Here, we present a controller for powered knee and ankle prostheses that can continuously adapt to different walking speeds, inclines, and uneven terrains without enforcing a specific prosthesis position, impedance, or torque. A virtual biarticular muscle connection determines the knee flexion torque, which changes with both speed and slope. Adaptation to inclines and uneven terrains is based solely on the global shank orientation. Continuously variable damping allows for speed adaptation. Minimum-jerk programming defines the prosthesis swing trajectory at variable cadences. Experiments with one individual with an above-knee amputation suggest that the proposed controller can effectively adapt to different walking speeds, inclines, and rough terrains.
|
| |
| 15:30-17:00, Paper MoBIP-02.6 | Add to My Program |
| Motor Unit Action Potential Based Classification of Hand and Arm Motions |
|
| Twardowski, Michael | Delsys & Altec Inc |
| Chan, Michael | Delsys Inc |
| Li, Zhi | Worcester Polytechnic Institute |
| De Luca, Gianluca | Delsys Inc |
| Kline, Joshua | Delsys & Altec Inc |
| Chiodini, John | Delsys Inc |
Keywords: Brain-Machine Interfaces, Motion Control, Prosthetics and Exoskeletons
Abstract: While motion classification architectures have improved in accuracy and robustness in recent years, computationally expensive approaches and sophisticated hardware dependencies limit their real-world applicability. To overcome these challenges, we have designed a lightweight, real-time architecture for classifying motions of the arm and hand using features derived from motor unit action potentials within surface electromyographic (sEMG) signals, which provide direct interrogation of underlying muscle activation patterns. We tested the architecture on 6 motions performed dynamically across a range of muscle contraction intensities, achieving median classification accuracies ranging from 91.3% to 93.3% and an average processing time of approximately 40 ms across three different classifiers. Taken together, our findings demonstrate the potential robustness of motor unit-based neural interfaces for motion classification tasks.
|
| |
| 15:30-17:00, Paper MoBIP-02.7 | Add to My Program |
| Adjusting the Quasi-Stiffness of an Ankle-Foot Prosthesis Improves Walking Stability During Locomotion Over Compliant Terrain |
|
| Karakasis, Chrysostomos | University of Delaware, Mechanical Engineering Department |
| Salati, Robert | University of Delaware |
| Artemiadis, Panagiotis | University of Delaware |
Keywords: Prosthetics and Exoskeletons
Abstract: Despite significant advances in the design of robotic lower-limb prostheses for individuals with impaired mobility, there is a need for further progress in improving the robustness, safety, and stability of these devices in a wide range of activities of daily living. Although powered prostheses have been able to adapt to different speeds, conditions, and rigid terrains, no control strategies have been proposed for addressing walking over compliant surfaces. This work proposes a continuous admittance controller that adjusts the ankle quasi-stiffness of a powered ankle-foot prosthesis and improves gait stability during locomotion over compliant terrain. The proposed controller is evaluated with walking experiments on an instrumented treadmill that can accurately change the walking surface stiffness. In these experiments, the proposed controller accurately changes the prosthesis ankle quasi-stiffness across a wide range of 10-20 Nm/deg, while improving local dynamic stability compared to a standard phase-variable controller. The proposed controller can significantly improve the performance of lower-limb prostheses in dynamic and compliant environments frequently encountered in daily activities, resulting in improved quality of life for people with lower-limb amputation.
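As a rough illustration of the quasi-stiffness adjustment described above, a stiffness gain in the reported 10-20 Nm/deg range can be selected from a surface-compliance estimate and mapped to ankle torque; the linear mapping, function names, and numbers are assumptions for illustration, not the paper's admittance controller.

```python
# Illustrative quasi-stiffness torque mapping: tau = -k * (theta - theta_eq),
# with the gain k chosen from a normalized surface-stiffness estimate.
def ankle_torque(theta_deg, theta_eq_deg, k_nm_per_deg):
    """Spring-like ankle torque about an equilibrium angle (Nm)."""
    return -k_nm_per_deg * (theta_deg - theta_eq_deg)

def stiffness_for_surface(surface_stiffness, k_min=10.0, k_max=20.0):
    """Interpolate a gain across the 10-20 Nm/deg range reported above from
    a normalized surface-stiffness estimate in [0, 1] (assumed mapping)."""
    s = min(max(surface_stiffness, 0.0), 1.0)
    return k_min + s * (k_max - k_min)

k = stiffness_for_surface(0.5)           # mid-compliance surface -> 15.0
print(k, ankle_torque(5.0, 2.0, k))      # 15.0, -45.0
```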
|
| |
| 15:30-17:00, Paper MoBIP-02.8 | Add to My Program |
| A Unified Controller for Natural Ambulation on Stairs and Level Ground with a Powered Robotic Knee Prosthesis |
|
| Cowan, Marissa | University of Utah |
| Creveling, Suzi | University of Utah |
| Sullivan, Liam | University of Utah |
| Gabert, Lukas | University of Utah |
| Lenzi, Tommaso | University of Utah |
Keywords: Prosthetics and Exoskeletons, Rehabilitation Robotics, Wearable Robotics
Abstract: Powered lower-limb prostheses have the potential to improve amputee mobility by closely imitating the biomechanical function of the missing biological leg. To accomplish this goal, powered prostheses need controllers that can seamlessly adapt to the ambulation activity intended by the user. Most powered prosthesis control architectures address this issue by switching between specific controllers for each activity. This approach requires online classification of the intended ambulation activity. Unfortunately, any misclassification can cause the prosthesis to perform a different movement than the user expects, increasing the likelihood of falls and injuries. Therefore, classification approaches require near-perfect accuracy to be used safely in real life. In this paper, we propose a unified controller for powered knee prostheses which allows for walking, stair ascent, and stair descent without the need for explicit activity classification. Experiments with one individual with an above-knee amputation show that the proposed controller enables seamless transitions between activities. Moreover, transition between activities is possible while leading with either the sound-side or the prosthesis. A controller with these characteristics has the potential to improve amputee mobility.
|
| |
| 15:30-17:00, Paper MoBIP-02.9 | Add to My Program |
| Volitional EMG Control Enables Stair Climbing with a Robotic Powered Knee Prosthesis |
|
| Creveling, Suzi | University of Utah |
| Cowan, Marissa | University of Utah |
| Sullivan, Liam | University of Utah |
| Gabert, Lukas | University of Utah |
| Lenzi, Tommaso | University of Utah |
Keywords: Prosthetics and Exoskeletons, Wearable Robotics, Cyborgs
Abstract: Existing controllers for robotic powered prostheses regulate the prosthesis speed, timing, and energy generation using predefined position or torque trajectories. This approach enables climbing stairs step-over-step. However, it does not provide amputees with direct volitional control of the robotic prosthesis, a functionality necessary to restore full mobility to the user. Here we show that proportional electromyographic (EMG) control of the prosthesis knee torque enables volitional control of a powered knee prosthesis during stair climbing. The proposed EMG controller continuously regulates knee torque based on activation of the residual hamstrings, measured using a single EMG electrode located within the socket. The EMG signal is mapped to a desired knee flexion/extension torque based on the prosthesis knee position, the residual limb position, and the interaction with the ground. As a result, the proposed EMG controller enabled an above-knee amputee to climb stairs at different speeds, while carrying additional loads, and even backwards. By enabling direct, volitional control of powered robotic knee prostheses, the proposed EMG controller has the potential to improve amputee mobility in the real world.
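Proportional myoelectric control of the kind described above can be sketched as normalizing residual-muscle EMG to an activation level and scaling it into a knee torque command; the normalization, gain, and stance/swing gating rule below are illustrative assumptions, not the authors' controller.

```python
# Hedged sketch of proportional EMG control: a rectified, smoothed EMG
# sample is normalized to [0, 1] and scaled into a knee torque, with the
# flexion/extension direction gated by ground contact. All constants are
# hypothetical.
def emg_activation(emg_rms, rest_level, max_level):
    """Normalize an EMG envelope sample to an activation in [0, 1]."""
    a = (emg_rms - rest_level) / (max_level - rest_level)
    return min(max(a, 0.0), 1.0)

def knee_torque(activation, in_stance, gain_nm=60.0):
    """Extension torque during stance, flexion torque during swing."""
    direction = 1.0 if in_stance else -1.0
    return direction * gain_nm * activation

a = emg_activation(0.35, 0.05, 0.55)     # roughly 0.6
print(knee_torque(a, in_stance=True))
```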
|
| |
| 15:30-17:00, Paper MoBIP-02.10 | Add to My Program |
| Development and Online Validation of an Intrinsic Fault Detector for a Powered Robotic Knee Prosthesis |
|
| Naseri, Amirreza | North Carolina State University |
| Liu, Ming | North Carolina State University |
| Lee, I-Chieh | UNC/NCSU Joint Department of Biomedical Engineering |
| Huang, He (Helen) | North Carolina State University and University of North Carolina |
Keywords: Prosthetics and Exoskeletons, Safety in HRI, Physical Human-Robot Interaction
Abstract: Robotic prosthetic legs have the potential to significantly improve the quality of life of lower-limb amputees performing locomotion in various environments and task conditions. However, these devices lack the capability to recover from internal intrinsic control faults, which can lead to harmful consequences affecting the user's gait performance and eroding trust in these robotic devices. Therefore, a reliable fault detection system is necessary to detect intrinsic faults in a timely manner and provide a compensatory response to mitigate their effects. This paper focuses on designing an active fault detector for a robotic knee prosthesis and demonstrates its effectiveness in real time. The developed system utilizes a Gaussian Process model to estimate knee angular velocity, which is sensitive to intrinsic faults, and relies on the difference between the estimated velocity and the actual measurement to detect internal control faults. In an offline analysis, the developed detector demonstrated a higher detection rate, lower false alarm ratio, and faster detection time compared with two previously reported approaches. An online demonstration was also conducted with a unilateral amputee participant and showed performance similar to that of the offline analysis. We expect that this detector can be integrated into a fault tolerance strategy to enhance the reliability and safety of robotic prosthetic legs, enabling users to perform their everyday tasks with greater confidence.
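The residual-based detection logic described above can be sketched as comparing estimated and measured knee velocity and flagging a fault when the residual persists beyond a threshold; the stand-in estimate sequence, threshold, and persistence window are assumptions, and the paper's Gaussian Process estimator is not reproduced here.

```python
# Illustrative residual-based fault detector: flag a fault once `window`
# consecutive residuals between estimated and measured knee angular
# velocity exceed `threshold`. All numbers are hypothetical.
from collections import deque

def fault_detector(estimates, measurements, threshold=0.5, window=3):
    """Yield a fault flag per sample from paired velocity streams (rad/s)."""
    recent = deque(maxlen=window)
    for est, meas in zip(estimates, measurements):
        recent.append(abs(est - meas) > threshold)
        yield len(recent) == window and all(recent)

est  = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
meas = [1.0, 1.1, 2.0, 2.2, 2.4, 1.5]   # samples 3-5 deviate from the model
print(list(fault_detector(est, meas)))
```

The persistence window trades detection speed against false alarms, mirroring the detection-time versus false-alarm-ratio comparison in the abstract.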
|
| |
| 15:30-17:00, Paper MoBIP-02.11 | Add to My Program |
| A Feasibility Study of Piecewise Phase Variable Based on Variable Toe-Off for the Powered Prosthesis Control: A Case Study |
|
| Hong, Woolim | North Carolina State University |
| Anil Kumar, Namita | Johnson and Johnson |
| Patrick, Shawanee | Texas A&M |
| Moon, Sunwoong | Gwangju Institute of Science and Technology |
| Hur, Pilwon | Gwangju Institute of Science and Technology |
Keywords: Prosthetics and Exoskeletons, Wearable Robotics, Rehabilitation Robotics
Abstract: To achieve stable walking and provide proper assistance, it is crucial to have synchronized control of the prosthesis, treating the user and the prosthesis as a coupled system. Additionally, speed adaptability is important for controlling the prosthesis at different walking speeds. One approach to achieving this is to use a phase variable to estimate the user's gait phase and control the prosthesis in synchrony with the user. However, the current phase variable (i.e., PV) cannot reflect variable toe-off timing at different speeds, although individuals have different toe-off timings per walking speed. To address this issue, we propose a piecewise phase variable (i.e., PW-PV) that can be adjusted for different toe-off timings while estimating the user's gait phase at various walking speeds. As a case study, we conducted a treadmill walking experiment with two participants (i.e., one healthy and one amputee) using a custom-built powered prosthesis. We collected and analyzed joint kinematics, kinetics, and ground reaction force data to validate the feasibility of the PW-PV. The use of the PW-PV resulted in both participants experiencing faster load transfer and a more natural rollover while walking. This allowed the healthy and amputee participants to walk with longer push-off durations of 10.6% and 15.2%, respectively, and greater ankle push-off work of 7.3% and 16.9%. Furthermore, with the PW-PV, the amputee participant demonstrated higher vertical ground reaction forces of 5.4% and 4.7% on her prosthesis-side leg during the load acceptance and push-off periods, potentially suggesting increased confidence in using the prosthesis. We anticipate that by using the proposed phase variable, we will be able to provide more appropriate and timely assistance to individuals at variable walking speeds.
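A piecewise phase variable with a subject-specific toe-off break point can be sketched as two linear segments over the normalized stride; the 0.5 phase split at toe-off and the example timing are illustrative assumptions, not the authors' PW-PV formulation.

```python
# Illustrative piecewise phase variable: phase advances linearly through
# stance (0 .. toe_off) and swing (toe_off .. 1), so shifting toe_off per
# walking speed reshapes the phase estimate. Constants are hypothetical.
def piecewise_phase(t, toe_off=0.62):
    """Map normalized stride time t in [0, 1] to phase in [0, 1], with the
    stance segment ending at the subject-specific toe-off timing."""
    if t <= toe_off:
        return 0.5 * t / toe_off                       # stance: phase 0..0.5
    return 0.5 + 0.5 * (t - toe_off) / (1.0 - toe_off)  # swing: phase 0.5..1

print(piecewise_phase(0.0), piecewise_phase(0.62), piecewise_phase(1.0))
```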
|
| |
| 15:30-17:00, Paper MoBIP-02.12 | Add to My Program |
| A Wearable Force-Sensitive and Body-Aware Exoprosthesis for a Transhumeral Prosthesis Socket (I) |
|
| Toedtheide, Alexander | Technical University of Munich, Chair of Robotics and Systems In |
| Pozo Fortunić, Edmundo | Technical University of Munich |
| Kuehn, Johannes | Technical University of Munich |
| Jensen, Elisabeth Rose | Technical University of Munich |
| Haddadin, Sami | Technical University of Munich |
Keywords: Prosthetics and Exoskeletons, Wearable Robots, Mechanism Design, Haptics and Haptic Interfaces
Abstract: Upper limb prostheses are commonly mounted to the human residual limb by a passive socket. With this design, the sensitive residual limb is exposed to reaction wrenches, which can be a source of medical complications. In this work, we introduce an active force-sensitive robotic socket that carries the prosthesis, offloads the residual limb, and at the same time allows guidance via small interaction forces. We investigate the feasibility of this concept with a force-sensitive and wearable shoulder exoskeleton, called an exoprosthesis when combined with a prosthesis. We provide a first mechatronic prototype, two floating-base controllers, and an analysis of the loads acting on the user. Simulations and experiments confirmed the concept and revealed that the wrench at the residual limb can be compensated for the static case and by 50% for the investigated motions. Human-in-the-loop tests were successfully performed by three able-bodied users, showing a real-world use case in a complex grasping situation. Overall, we believe that a force-sensitive robotic socket has the potential to advance prosthetics to a new level as it provides an intuitive and seamless user control interface.
|
| |
| MoBIP-03 Regular session, Hall E |
Add to My Program |
| Clone of 'Collision Avoidance II' |
|
| |
| |
| 15:30-17:00, Paper MoBIP-03.1 | Add to My Program |
| AdaptiveON: Adaptive Outdoor Local Navigation Method for Stable and Reliable Actions |
|
| Liang, Jing | University of Maryland |
| Kulathun Mudiyanselage, Kasun Weerakoon | University of Maryland, College Park |
| Guan, Tianrui | University of Maryland |
| Karapetyan, Nare | University of Maryland |
| Manocha, Dinesh | University of Maryland |
Keywords: Motion and Path Planning, Planning under Uncertainty, Collision Avoidance
Abstract: We present a novel outdoor navigation algorithm to generate stable and efficient actions to navigate a robot to a goal. We use a multi-stage training pipeline and show that our approach produces policies that result in stable and reliable robot navigation on complex terrains. Based on the Proximal Policy Optimization (PPO) algorithm, we developed a novel method to achieve multiple capabilities for outdoor local navigation tasks, namely alleviating the robot's drifting, keeping the robot stable on bumpy terrains, avoiding climbing hills with steep elevation changes, and avoiding collisions. Our training process mitigates the reality (sim-to-real) gap by introducing generalized environmental and robotic parameters and training with rich features captured from a light detection and ranging (LiDAR) sensor in a high-fidelity Unity simulator. We evaluate our method in both simulated and real-world environments using Clearpath Husky and Jackal robots. Further, we compare our method against state-of-the-art approaches and observe that, in the real world, our method improves stability by at least 30.7% on uneven terrains, reduces drifting by 8.08%, and decreases elevation changes by 14.75%.
|
| |
| 15:30-17:00, Paper MoBIP-03.2 | Add to My Program |
| Intention Communication and Hypothesis Likelihood in Game-Theoretic Motion Planning |
|
| Chahine, Makram | Massachusetts Institute of Technology |
| Firoozi, Roya | Stanford University |
| Xiao, Wei | MIT |
| Schwager, Mac | Stanford University |
| Rus, Daniela | MIT |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Planning under Uncertainty, Robot Safety
Abstract: Game-theoretic motion planners are a potent solution for controlling systems of multiple highly interactive robots. Most existing game-theoretic planners unrealistically assume that a priori objective function knowledge is available to all agents. To address this, we propose a fault-tolerant receding horizon game-theoretic motion planner that leverages inter-agent communication with intention hypothesis likelihoods. Specifically, robots communicate their objective functions, which encode their intentions. A discrete Bayesian filter is designed to infer the objectives in real time based on the discrepancy between observed trajectories and those derived from the communicated intentions. In simulation, we consider three safety-critical autonomous driving scenarios of overtaking, lane merging, and intersection crossing to demonstrate our planner's ability to capitalize on alternative intention hypotheses to generate safe trajectories in the presence of faulty transmissions in the communication network.
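One step of the discrete Bayesian filter described above can be sketched as follows. The Gaussian likelihood on the trajectory discrepancy and all names are assumptions of this sketch, not the paper's exact formulation.

```python
import math

def bayes_update(prior, discrepancies, sigma=1.0):
    """One discrete Bayesian filter step over intention hypotheses.

    prior: list of hypothesis probabilities.
    discrepancies: distance between the observed trajectory and the one
    predicted under each hypothesis (smaller = more consistent).
    The Gaussian likelihood on the discrepancy is an assumption here.
    """
    likelihood = [math.exp(-0.5 * (d / sigma) ** 2) for d in discrepancies]
    posterior = [p * l for p, l in zip(prior, likelihood)]
    z = sum(posterior)  # normalization constant
    return [p / z for p in posterior]
```

A hypothesis whose predicted trajectory matches the observed one gains probability mass, so a faulty transmitted intention is down-weighted automatically.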
|
| |
| 15:30-17:00, Paper MoBIP-03.3 | Add to My Program |
| Collision-Free Reconfiguration Planning for Variable Topology Trusses Using a Linking Invariant |
|
| Spinos, Alexander | University of Pennsylvania |
| Yim, Mark | University of Pennsylvania |
Keywords: Cellular and Modular Robots, Motion and Path Planning, Computational Geometry
Abstract: We introduce a multi-modal reconfiguration planner for the Variable Topology Truss (VTT) modular robot system. The VTT system is a truss-architecture modular self-reconfigurable robot. When a VTT is restricted to a single topology, the collision constraints between the truss members divide the configuration space into many connected components, which makes collision-free planning difficult. This new planner leverages a mathematical invariant based on link theory to find topological reconfiguration actions that can connect these different regions and make progress towards a goal. We show that this planner is effective at finding paths between configurations with different truss topologies.
|
| |
| 15:30-17:00, Paper MoBIP-03.4 | Add to My Program |
| Hybrid Map-Based Path Planning for Robot Navigation in Unstructured Environments |
|
| Liu, Jiayang | National University of Defense Technology |
| Chen, Xieyuanli | National University of Defense Technology |
| Xiao, Junhao | National University of Defense Technology |
| Sichao, Lin | National University of Defense Technology |
| Zheng, Zhiqiang | National University of Defense Technology |
| Lu, Huimin | National University of Defense Technology |
Keywords: Motion and Path Planning, Collision Avoidance, Autonomous Vehicle Navigation
Abstract: Fast and accurate path planning is important for ground robots to achieve safe and efficient autonomous navigation in unstructured outdoor environments. However, most existing methods exploiting either 2D or 2.5D maps struggle to balance the efficiency and safety for ground robots navigating in such challenging scenarios. In this paper, we propose a novel hybrid map representation by fusing a 2D grid and a 2.5D digital elevation map. Based on it, a novel path planning method is proposed, which considers the robot poses during traversability estimation. By doing so, our method explicitly takes safety as a planning constraint enabling robots to navigate unstructured environments smoothly. The proposed approach has been evaluated on both simulated datasets and a real robot platform. The experimental results demonstrate the efficiency and effectiveness of the proposed method. Compared to state-of-the-art baseline methods, the proposed approach consistently generates safer and easier paths for the robot in different unstructured outdoor environments. The implementation of our method is publicly available at https://github.com/nubot-nudt/T-Hybrid-planner.
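A hybrid-map planner of this kind needs an edge cost that consults both layers. The sketch below is a simplified reading of such a representation; the flag, thresholds, and weights are illustrative, not the paper's.

```python
import math

def edge_cost(occ_next, elev_here, elev_next, max_step=0.15, slope_weight=5.0):
    """Edge cost fusing a 2D occupancy flag with a 2.5D elevation difference.

    occ_next: True if the 2D grid layer marks the target cell occupied.
    elev_here, elev_next: elevations (m) from the 2.5D layer.
    Returns math.inf for untraversable edges; illustrative constants only.
    """
    if occ_next:  # blocked in the 2D grid layer
        return math.inf
    step = abs(elev_next - elev_here)  # 2.5D elevation layer
    if step > max_step:  # too steep a step for the robot
        return math.inf
    return 1.0 + slope_weight * step  # prefer flatter ground
```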
|
| |
| 15:30-17:00, Paper MoBIP-03.5 | Add to My Program |
| CDT-Dijkstra: Fast Planning of Globally Optimal Paths for All Points in 2D Continuous Space |
|
| Liu, Jinyuan | Zhejiang University of Technology |
| Fu, Minglei | Zhejiang University of Technology |
| Zhang, Wen-An | Zhejiang University of Technology, China |
| Chen, Bo | Zhejiang University of Technology |
| Prakapovich, Ryhor | United Institute of Informatics Problems of the National Academy O |
| Sychou, Uladzislau | United Institute of Informatics Problems of the National Academy |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation, Collision Avoidance
Abstract: The Dijkstra algorithm is a classic path planning method that, in a discrete graph space, can start from a specified source node and find the shortest path between the source node and all other nodes in the graph. However, to the best of our knowledge, there is no effective method that achieves a similar function in a continuous space. In this study, an optimal path planning algorithm called convex dissection topology (CDT)-Dijkstra is developed, which can quickly compute the globally optimal path from one point to all other points in a 2D continuous space. CDT-Dijkstra has two main stages: SetInit and GetGoal. In SetInit, the algorithm quickly obtains the optimal CDT encoding set of all the cut lines based on the initial point. In GetGoal, the algorithm returns the globally optimal path to any goal point at extremely high speed. We propose and prove the planning principle of considering only the points on the cut lines, thus reducing the state space of the distance-optimal path planning task from 2D to 1D. In addition, we propose a fast method to find the optimal path within a homotopy class and theoretically prove its correctness. Finally, experiments in a series of environments demonstrate that CDT-Dijkstra not only plans the optimal paths from all points at once, but also holds a significant advantage over state-of-the-art algorithms on certain complex tasks.
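For reference, the discrete one-to-all computation that CDT-Dijkstra extends to continuous space is the standard Dijkstra algorithm, sketched here over an adjacency list (the example graph is illustrative):

```python
import heapq

def dijkstra(adj, source):
    """Shortest distance from source to every reachable node.

    adj: {node: [(neighbor, weight), ...]} with non-negative weights.
    """
    dist = {source: 0.0}
    pq = [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

# tiny example graph
adj = {"a": [("b", 1), ("c", 4)], "b": [("c", 1)], "c": []}
# dijkstra(adj, "a") -> {"a": 0.0, "b": 1.0, "c": 2.0}
```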
|
| |
| 15:30-17:00, Paper MoBIP-03.6 | Add to My Program |
| Large Scale Pursuit-Evasion under Collision Avoidance Using Deep Reinforcement Learning |
|
| Yang, Helei | Zhejiang University |
| Ge, Peng | Zhejiang University |
| Cao, Junjie | Institute of Cyber Systems and Control, Zhejiang University |
| Yang, Yifan | ZheJiang University |
| Liu, Yong | Zhejiang University |
Keywords: Multi-Robot Systems, Collision Avoidance, Autonomous Agents
Abstract: This paper examines a pursuit-evasion game (PEG) involving multiple pursuers and evaders. The decentralized pursuers aim to collaborate to capture the faster evaders while avoiding collisions. The policies of all agents are learning-based and are subjected to kinematic constraints that are specific to unicycles. To address the challenge of high dimensionality encountered in large-scale scenarios, we propose a state processing method named Mix-Attention, which is based on Self-Attention. This method effectively mitigates the curse of dimensionality. The simulation results provided in this study demonstrate that the combination of Mix-Attention and Independent Proximal Policy Optimization (IPPO) surpasses alternative approaches when solving the multi-pursuer multi-evader PEG, particularly as the number of entities increases. Moreover, the trained policies showcase their ability to adapt to scenarios involving varying numbers of agents and obstacles without requiring retraining. This adaptability showcases their transferability and robustness. Finally, our proposed approach has been validated through physical experiments conducted with six robots.
|
| |
| 15:30-17:00, Paper MoBIP-03.7 | Add to My Program |
| A Gaussian Variational Inference Approach to Motion Planning |
|
| Yu, Hongzhe | Georgia Institute of Technology |
| Chen, Yongxin | Georgia Institute of Technology |
Keywords: Motion and Path Planning, Planning under Uncertainty, Optimization and Optimal Control
Abstract: We propose a Gaussian variational inference framework for the motion planning problem. In this framework, motion planning is formulated as an optimization over the distribution of trajectories, approximating the desired trajectory distribution by a tractable Gaussian. Equivalently, the proposed framework can be viewed as standard motion planning with an entropy regularization. The solution obtained is thus a transition from an optimal deterministic solution to a stochastic one, and the framework can recover the deterministic solution by controlling the level of stochasticity. To solve this optimization, we adopt a natural gradient descent scheme. The sparsity structure induced by factorized objective functions is further leveraged to improve the scalability of the algorithm. We evaluate our method on several robot systems in simulated environments and show that it achieves collision avoidance with smooth trajectories while adding robustness over the deterministic baseline, especially in challenging environments and tasks.
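As a toy illustration of the entropy-regularized view (deliberately not the paper's factorized, natural-gradient implementation), one can fit a 1-D Gaussian q = N(mu, s^2) to minimize E_q[V(x)] - H(q) for V(x) = x^2/2, whose known optimum is mu = 0, s = 1:

```python
import math

def gvi_1d(steps=2000, lr=0.05):
    """Entropy-regularized fit of a 1-D Gaussian by plain gradient descent.

    For V(x) = x^2/2: E_q[V] = (mu^2 + s^2)/2 and H(q) = 0.5*log(2*pi*e*s^2),
    so the objective in (mu, t = log s) is (mu^2 + e^(2t))/2 - t + const.
    Plain gradient descent stands in for the natural-gradient scheme.
    """
    mu, log_s = 2.0, math.log(3.0)
    for _ in range(steps):
        s2 = math.exp(2 * log_s)
        grad_mu = mu            # d/dmu of mu^2 / 2
        grad_log_s = s2 - 1.0   # d/dt of e^(2t)/2 - t
        mu -= lr * grad_mu
        log_s -= lr * grad_log_s
    return mu, math.exp(log_s)
```

The entropy term keeps the variance away from zero, which is what turns the deterministic optimum into a distribution over trajectories.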
|
| |
| 15:30-17:00, Paper MoBIP-03.8 | Add to My Program |
| Exploring Social Motion Latent Space and Human Awareness for Effective Robot Navigation in Crowded Environments |
|
| Ansari, Junaid Ahmed | Tata Consultancy Services |
| Tourani, Satyajit | TCS |
| Kumar, Gourav | Tata Consultancy Services, Kolkata, India |
| Bhowmick, Brojeshwar | Tata Consultancy Services |
Keywords: Collision Avoidance, Social HRI, Machine Learning for Robot Control
Abstract: This work proposes a novel approach to social robot navigation by learning to generate robot controls from a social motion latent space. By leveraging this social motion latent space, the proposed method achieves significant improvements in social navigation metrics such as success rate, navigation time, and trajectory length while producing smoother (less jerk and fewer angular deviations) and more anticipatory trajectories. The superiority of the proposed method is demonstrated through comparison with baseline models in various scenarios. Additionally, the concept of humans' awareness of the robot is introduced into the social robot navigation framework, showing that incorporating human awareness leads to shorter and smoother trajectories owing to humans' ability to interact positively with the robot.
|
| |
| 15:30-17:00, Paper MoBIP-03.9 | Add to My Program |
| DS-MPEPC: Safe and Deadlock-Avoiding Robot Navigation in Cluttered Dynamic Scenes |
|
| Arul, Senthil Hariharan | University of Maryland, College Park |
| Park, Jong Jin | Amazon Lab126 |
| Manocha, Dinesh | University of Maryland |
Keywords: Motion and Path Planning, Collision Avoidance
Abstract: We present an algorithm for safe robot navigation in complex dynamic environments using a variant of model predictive equilibrium point control. We use an optimization formulation to navigate robots gracefully in dynamic environments by optimizing over a trajectory cost function at each timestep. We present a novel trajectory cost formulation that significantly reduces conservative and deadlocking behaviors and generates smooth trajectories. In particular, we propose a new collision probability function that effectively captures the risk associated with a given configuration and the time to avoid collisions based on the velocity direction. Moreover, we propose a terminal state cost based on the expected time-to-goal and time-to-collision values that helps in avoiding trajectories that could result in deadlock. We evaluate our cost formulation in multiple simulated scenarios, including narrow corridors with dynamic obstacles, and observe significantly improved navigation behavior and reduced deadlocks as compared to prior methods.
|
| |
| 15:30-17:00, Paper MoBIP-03.10 | Add to My Program |
| 3D-Online Generalized Sensed Shape Expansion: A Probabilistically Complete Motion Planner in Obstacle-Cluttered Unknown Environments |
|
| Zinage, Vrushabh | University of Texas at Austin |
| Arul, Senthil Hariharan | University of Maryland, College Park |
| Manocha, Dinesh | University of Maryland |
| Ghosh, Satadal | Indian Institute of Technology Madras |
Keywords: Collision Avoidance, Motion and Path Planning, Simulation and Animation
Abstract: We present an online motion planning algorithm (3D-OGSSE) for generating smooth, collision-free trajectories over multiple planning iterations for a 3-D agent operating in an unknown, obstacle-cluttered, 3-D environment. In each planning iteration, 3D-OGSSE constructs an obstacle-free region termed the 'generalized sensed shape' based on locally sensed environment information and the notion of generalized shape. A collision-free path is computed by sampling points in the generalized sensed shape and is used to generate a smooth, time-parametrized trajectory by minimizing snap. The generated trajectory at every planning iteration is constrained to lie within the generalized sensed shape, which ensures the agent maneuvers in locally obstacle-free space. As the agent reaches the boundary of the generalized sensed shape in a planning iteration, a re-plan is triggered by a receding-horizon planning mechanism that also initializes the next planning iteration. We also present a theoretical guarantee of probabilistic completeness of the developed algorithm over the entire environment and of completely collision-free trajectory generation. We evaluate the proposed method in simulation on complex 3-D environments with varied obstacle densities. Further, we evaluate scenarios with sensor noise and constraints on the on-board sensor's field of view (FOV). We observe that each planning iteration takes approximately 14 milliseconds on a single thread of an Intel Core i5-8500 3.0 GHz CPU, which is significantly faster than several existing algorithms. In addition, we observe that 3D-OGSSE is less conservative in complex scenarios such as narrow passages.
|
| |
| 15:30-17:00, Paper MoBIP-03.11 | Add to My Program |
| Safe and Efficient Dynamic Window Approach for Differential Mobile Robots with Stochastic Dynamics Using Deterministic Sampling |
|
| Yasuda, Shinya | NEC Corporation |
| Kumagai, Taichi | NEC Corporation |
| Yoshida, Hiroshi | NEC Corporation |
Keywords: Collision Avoidance, Planning under Uncertainty, Motion and Path Planning
Abstract: We propose an efficient and safe dynamic window approach (DWA) that uses deterministic sampling. When the system dynamics are uncertain, the control input includes errors, so the DWA objective function becomes a random variable. When a random-choice algorithm with a finite number of samples is used to estimate the objective function, it may miss collisions during prediction. In this work, we approximate the end-state distribution as a one-dimensional distribution for each input candidate in advance and generate sample paths deterministically, eliminating such misses and achieving safe control. Numerical experiments show that this method is approximately three times as efficient as the Monte Carlo method in most indoor environments.
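The evaluation of one input candidate under end-state uncertainty can be sketched as below. The 1-D Gaussian approximation and the fixed sample offsets are assumptions of this sketch, standing in for whatever one-dimensional distribution and deterministic sample set the method actually uses.

```python
def evaluate_candidate(nominal_end, std, clearance_fn, offsets=(-2.0, -1.0, 0.0, 1.0, 2.0)):
    """Worst-case clearance for one DWA input candidate.

    The end state is approximated by a 1-D Gaussian along the motion
    direction; deterministic samples at mean + c*std replace random
    Monte Carlo draws, so the same collision checks run every cycle and
    near-boundary misses cannot occur by sampling luck.
    clearance_fn(x) returns distance to the nearest obstacle at end state x.
    """
    return min(clearance_fn(nominal_end + c * std) for c in offsets)
```

A candidate is then rejected when its worst-case clearance falls below the safety margin, rather than when a random draw happens to hit an obstacle.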
|
| |
| 15:30-17:00, Paper MoBIP-03.12 | Add to My Program |
| Path Re-Planning Design of a Cobot in a Dynamic Environment Based on Current Obstacle Configuration |
|
| Lee, Chuan-Che | National Yang Ming Chiao Tung University |
| Song, Kai-Tai | National Yang Ming Chiao Tung University |
Keywords: Collision Avoidance, Motion and Path Planning, Human-Robot Collaboration
Abstract: This study proposes a path planning algorithm that generates a collision-free path avoiding static and dynamic obstacles in real time. An efficient path re-planning method is presented for obstacle avoidance by a cobot in an environment shared by humans and robots. Static and dynamic obstacles are tracked while the manipulator executes a trajectory along a planned initial static path. When a dynamic obstacle enters the robot's workspace, the proposed method re-plans a collision-free local path to avoid static and dynamic obstacles. To allow fast local re-planning, a hybrid method is proposed that combines the advantages of the APF and RRT path planning algorithms. The weight factors for the hybrid method are determined according to the current configuration of obstacles. Experimental results on a TM5-700 manipulator show that the proposed method decreases re-planning time and path length in an environment with static and dynamic obstacles. The path re-planning time is at least 55% less than those of two existing path planning optimization methods, D-RRT and VF-RRT.
|
| |
| 15:30-17:00, Paper MoBIP-03.13 | Add to My Program |
| DRL-VO: Learning to Navigate through Crowded Dynamic Scenes Using Velocity Obstacles (I) |
|
| Xie, Zhanteng | Temple University |
| Dames, Philip | Temple University |
Keywords: Collision Avoidance, Deep Learning in Robotics and Automation, Field Robots, Reactive and Sensor-Based Planning
Abstract: This paper proposes a novel learning-based control policy with strong generalizability to new environments that enables a mobile robot to navigate autonomously through spaces filled with both static obstacles and dense crowds of pedestrians. The policy uses a unique combination of input data to generate the desired steering angle and forward velocity: a short history of lidar data, kinematic data about nearby pedestrians, and a sub-goal point. The policy is trained in a reinforcement learning setting using a reward function that contains a novel term based on velocity obstacles to guide the robot to actively avoid pedestrians and move towards the goal. Through a series of 3D simulated experiments with up to 55 pedestrians, this control policy is able to achieve a better balance between collision avoidance and speed (i.e., higher success rate and faster average speed) than state-of-the-art model-based and learning-based policies, and it also generalizes better to different crowd sizes and unseen environments. An extensive series of hardware experiments demonstrates the ability of this policy to directly work in different real-world environments with different crowd sizes with zero retraining.
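A velocity-obstacle reward term of the kind described above can be sketched as a penalty triggered when the relative velocity points inside a pedestrian's collision cone. This is a simplified stand-in for the paper's reward, with illustrative names and weights.

```python
import math

def vo_penalty(p_rel, v_rel, radius, weight=1.0):
    """Penalty when v_rel lies inside the pedestrian's velocity-obstacle cone.

    p_rel: (x, y) pedestrian position relative to the robot.
    v_rel: (x, y) robot velocity relative to the pedestrian.
    radius: combined robot + pedestrian radius.
    """
    d = math.hypot(*p_rel)
    if d <= radius:
        return -weight  # already in collision
    half_angle = math.asin(radius / d)  # cone half-angle toward the pedestrian
    speed = math.hypot(*v_rel)
    if speed == 0.0:
        return 0.0
    cos_theta = (p_rel[0] * v_rel[0] + p_rel[1] * v_rel[1]) / (d * speed)
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    return -weight if theta < half_angle else 0.0
```

Summed over nearby pedestrians and added to the goal-progress reward, such a term pushes the policy to steer the relative velocity out of every collision cone.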
|
| |
| MoBIP-04 Regular session, Hall E |
Add to My Program |
| Clone of 'Motion Control' |
|
| |
| |
| 15:30-17:00, Paper MoBIP-04.1 | Add to My Program |
| Model Predictive Control of Autonomous Vehicles with Integrated Barriers Using Occupancy Grid Maps |
|
| Cho, Minsu | Korea Advanced Institute of Science and Technology |
| Lee, Yeongseok | Korea Advanced Institute of Science and Technology |
| Kim, Kyung-Soo | KAIST (Korea Advanced Institute of Science and Technology) |
Keywords: Integrated Planning and Control, Motion and Path Planning, Autonomous Vehicle Navigation
Abstract: Nonlinear model predictive control (NMPC) is an efficient and proven method for optimization-based autonomous vehicle motion planning. Among the various approaches, the iterative linear quadratic regulator, a differential dynamic programming variant, is a well-known efficient nonlinear optimization method. In safety-critical control systems, controllers should address inequality-constrained optimization problems. In this work, we design a single unified constraint using an occupancy grid map. We convert this inequality-constrained optimization problem into an unconstrained optimization problem by appending a single integrated discrete barrier state to the system model. This approach simplifies complex motion planning problems and reduces computational costs. In the proposed method, we first discretize the surrounding environment with an occupancy grid map and design a single constraint that ensures that only cells with values less than a predefined threshold can be traversed by the ego vehicle. Then, we define a single integrated discrete barrier state to introduce this constraint into the motion planning algorithm. The proposed method, a penalty method, and the augmented Lagrangian method are tested on a real-time software-in-the-loop simulation using CarMaker and ROS. The simulation results of pop-up obstacle avoidance scenarios show the benefits of the proposed method, such as reduced time costs and increased robustness.
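The per-cell constraint can be illustrated with a barrier function over occupancy values. The log-barrier form and the threshold below are assumptions of this sketch, not the paper's integrated discrete barrier state.

```python
import math

def barrier_value(grid, cell, threshold=0.5):
    """Barrier value for one occupancy-grid cell.

    Cells with occupancy below the threshold are traversable; the barrier
    grows without bound as occupancy approaches the threshold, so appending
    it to the state penalizes trajectories that approach occupied cells.
    """
    occ = grid[cell[0]][cell[1]]
    if occ >= threshold:
        return math.inf  # forbidden cell
    return -math.log(threshold - occ)  # smooth barrier inside the safe set
```

Summing this value along a candidate trajectory turns the inequality constraint into a single scalar the unconstrained optimizer can drive down.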
|
| |
| 15:30-17:00, Paper MoBIP-04.2 | Add to My Program |
| Path-Following Control with Path and Orientation Snap-In |
|
| Hartl-Nesic, Christian | TU Wien |
| Pritzi, Elias | TU Wien |
| Kugi, Andreas | TU Wien |
Keywords: Human-Robot Collaboration, Compliance and Impedance Control, Industrial Robots
Abstract: Robots need to be as simple to use as tools in a workshop and allow non-experts to program, modify and execute tasks. In particular for repetitive tasks in high-mix/low-volume production, robotic support and physical human-robot interaction (pHRI) help to significantly increase productivity. In path-following control (PFC), the geometric description of the path is decoupled from the time evolution of the robot's end-effector along the path. PFC is inherently suitable for pHRI since path progress can be derived from the interaction with the human. In this work, an extension to multi-path PFC is proposed, which allows smooth transitions between the paths initiated by the human. Additionally, two pHRI modes called path snap-in and orientation snap-in are proposed, which use attractive forces to snap the robot end-effector onto a path or a predefined orientation. Moreover, the stability properties of PFC are inherited and the method is applicable to linear, nonlinear and self-intersecting paths. The proposed pHRI modes are validated on an experimental drilling task for teach-in (using orientation snap-in) and execution (using path snap-in) with the kinematically redundant collaborative robot KUKA LBR iiwa 14 R820.
|
| |
| 15:30-17:00, Paper MoBIP-04.3 | Add to My Program |
| Design and Control of a Reluctance-Based Micropositioning Stage for Scanning Motion Applications |
|
| Al Saaideh, Mohammad | Memorial University of Newfoundland |
| Alatawneh, Natheer | Cysca Technologies |
| Aljanaideh, Khaled | Jordan University of Science and Technology |
| Al Janaideh, Mohammad | University of Guelph |
Keywords: Motion Control
Abstract: This paper presents the design and characterization of a micropositioning stage driven by a reluctance actuator. The stage is constructed with a C-core reluctance actuator and four compression springs. The design of the stage is presented using a CAD model, followed by the fabrication process of the prototype. A mathematical model is formulated to capture the interaction among the stage's electrical, magnetic, and mechanical dynamic behavior. Next, the force-current and force-gap characteristics are obtained by measuring the force under different applied currents and air gaps. The system is then analyzed to determine the maximum applied voltage that stabilizes it in an open-loop configuration, followed by the time-domain and frequency-domain responses. Finally, a feedforward controller is presented to linearize the dynamic behavior of the stage over a specific range of motion. The experimental results under the feedforward controller show a linear characteristic between the desired force and the output displacement.
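The feedforward linearization can be illustrated with the textbook C-core model F = k·i²/g². This assumed form stands in for the experimentally identified force-current and force-gap maps; the constant `k` is illustrative.

```python
import math

def force(i, gap, k=1.0e-5):
    """Forward model: attractive force for current i (A) at air gap (m)."""
    return k * i ** 2 / gap ** 2

def feedforward_current(f_des, gap, k=1.0e-5):
    """Invert the force model so the commanded current yields f_des.

    i = g * sqrt(F/k) linearizes the desired-force-to-force map, which is
    the role of the feedforward controller described in the abstract.
    """
    return gap * math.sqrt(f_des / k)
```

Because the inversion cancels the quadratic current dependence, output force tracks desired force linearly as long as the gap estimate is accurate.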
|
| |
| 15:30-17:00, Paper MoBIP-04.4 | Add to My Program |
| Body Posture Controller for Actively Articulated Tracked Vehicles Moving Over Rough and Unknown Terrains |
|
| Santos Rocha, Filipe Augusto | COPPE / Federal University of Rio De Janeiro (UFRJ) |
| Cid, André | Instituto Tecnologico Vale |
| Delunardo, Mario | Instituto Tecnologico Vale |
| P. Junior, Renato | Instituto Tecnologico Vale |
| Costa Pereira de S. Thiago Neto, Nilton | Universidade Federal De Ouro Preto |
| Barros, Luiz | Instituto Tecnologico Vale |
| D. Domingues, Jaco | Instituto Tecnologico Vale |
| Pessin, Gustavo | Instituto Tecnológico Vale |
| Freitas, Gustavo | Federal University of Minas Gerais |
| Costa, Ramon | Federal University of Rio De Janeiro |
Keywords: Motion Control, Kinematics, Field Robots
Abstract: Terrestrial mobile robots face diverse topographies during field missions. Rough terrain causes the platform to oscillate, which is undesirable for some tasks. Robotic platforms with active tracked flippers can use these mechanisms to reach and maintain a leveled configuration while halted or moving. Thus, this work presents a posture controller that regulates the robot's orientation and contact-plane clearance using the flippers while the robot moves over unknown, uneven ground. The method takes as input the flippers' joint positions and torques and the robot chassis orientation, and outputs the flippers' joint velocities as the command signal. Based on Stewart platforms, a differential kinematics model relates the desired platform motion to the flippers' frame velocities. A flipper-ground interaction model then transforms the computed frame velocities into flipper joint speed commands. The controller uses dual-quaternion algebra to generate the error signal. The efficacy of the proposed controller is evaluated experimentally on an industrial robotic platform as it moves along an open-field track. The method successfully regulates the robot's posture while navigating over non-modeled rough terrain.
|
| |
| 15:30-17:00, Paper MoBIP-04.5 | Add to My Program |
| Exploring Learning-Based Control Policy for Fish-Like Robots in Altered Background Flows |
|
| Lin, Xiaozhu | ShanghaiTech University |
| Song, Wenbin | ShanghaiTech University |
| Liu, Xiaopei | ShanghaiTech University |
| He, Xuming | ShanghaiTech University |
| Wang, Yang | ShanghaiTech University |
Keywords: Motion Control, Biologically-Inspired Robots, Reinforcement Learning
Abstract: The study of motion control for fish-like robots in complex fluid fields is of great importance for improving the performance of underwater vehicles, owing to their strong maneuverability, propulsion efficiency, and deceptive visual appearance. In this article, a novel learning-based control framework is proposed to autonomously explore efficient control policies capable of performing motion control tasks in non-quiescent and unknown background flows. First, we use a high-fidelity simulation system, named FishGym, to generate various uniform flows. Next, a DRL-based algorithm is coupled with FishGym to train the fish-like robot to control its motion and optimally complete a carefully designed task (Approaching Target and Stay) in both quiescent and uniform flow. Then, the obtained control policy, together with an online estimator, is directly applied to a path-following task. The proposed framework balances simulation accuracy and computational efficiency, which is crucial for effective coupling with the learning algorithm. The simulation results indicate that, via the proposed learning framework, the robot successfully acquires a swimming strategy that adapts to different background flows and tasks. Furthermore, we observe adaptation behaviors of the robot, such as rheotaxis, that resemble those of fish in nature, offering more insight into the mechanisms underlying fish adaptation in complex environments.
|
| |
| 15:30-17:00, Paper MoBIP-04.6 | Add to My Program |
| On the Design of Region-Avoiding Metrics for Collision-Safe Motion Generation on Riemannian Manifolds |
|
| Klein, Holger | Karlsruhe Institute of Technology |
| Jaquier, No�mie | Karlsruhe Institute of Technology |
| Meixner, Andre | Karlsruhe Institute of Technology (KIT) |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: Motion Control, Human and Humanoid Motion Analysis and Synthesis, Dynamics
Abstract: The generation of energy-efficient and dynamic-aware robot motions that satisfy constraints such as joint limits, self-collisions, and collisions with the environment remains a challenge. In this context, Riemannian geometry offers promising solutions by identifying robot motions with geodesics on the so-called configuration space manifold. While this manifold naturally considers the intrinsic robot dynamics, constraints such as joint limits, self-collisions, and collisions with the environment remain overlooked. In this paper, we propose a modification of the Riemannian metric of the configuration space manifold allowing for the generation of robot motions as geodesics that efficiently avoid given regions. We introduce a class of Riemannian metrics based on barrier functions that guarantee strict region avoidance by systematically generating accelerations away from no-go regions in joint and task space. We evaluate the proposed Riemannian metric to generate energy-efficient, dynamic-aware, and collision-free motions of a humanoid robot as geodesics and sequences thereof.
|
| |
| 15:30-17:00, Paper MoBIP-04.7 | Add to My Program |
| Towards Connecting Control to Perception: High-Performance Whole-Body Collision Avoidance Using Control-Compatible Obstacles |
|
| Eckhoff, Moritz | Technical University of Munich (TUM) |
| Knobbe, Dennis | Technical University of Munich (TUM) |
| Zwirnmann, Henning | Technical University of Munich |
| Swikir, Abdalla | Technical University of Munich |
| Haddadin, Sami | Technical University of Munich |
Keywords: Whole-Body Motion Planning and Control, Force Control, Multi-Modal Perception for HRI
Abstract: One of the most important aspects of autonomous systems is safety. This includes ensuring safe human-robot and robot-environment interaction when autonomously performing complex tasks or in collaborative scenarios. Although several methods have been introduced to tackle this, most are unsuitable for real-time applications and require carefully hand-crafted obstacle descriptions. In this work, we propose a method combining high-frequency, real-time self- and environment-collision avoidance of a robotic manipulator with low-frequency, multimodal, high-resolution environmental perception accumulated in a digital twin system. Our method is based on geometric primitives, so-called primitive skeletons. These, in turn, are information-compressed, real-time-compatible digital representations of the robot's body and environment, automatically generated from ultra-realistic virtual replicas of the real world provided by the digital twin. Our approach is a key enabler for closing the loop between environment perception and robot control by providing the millisecond real-time control stage with a current and accurate world description, empowering it to react to environmental changes. We evaluate our whole-body collision avoidance on a 9-DOF robot system through five experiments, demonstrating the functionality and efficiency of our framework.
|
| |
| 15:30-17:00, Paper MoBIP-04.8 | Add to My Program |
| Real-Time Whole-Body Collision Avoidance and Path Following of a Snake Robot through MPC-Based Optimization Strategies |
|
| Wang, Liuyin | University of Shanghai for Science and Technology |
| Wang, Gang | University of Nevada |
| Li, Yuan | University of Shanghai for Science and Technology |
| Li, Peng | Harbin Institute of Technology ShenZhen |
| Ji, Yunfeng | University of Shanghai for Science and Technology |
| Wang, Chaoli | University of Shanghai for Science and Technology |
| Shen, Yantao | University of Nevada, Reno |
Keywords: Redundant Robots, Motion Control, Actuation and Joint Mechanisms
Abstract: This paper delves into the challenge of whole-body obstacle avoidance of the elongated body during path following for a class of bionic snake robots. Currently, most studies focus solely on preventing the robot's head from colliding with obstacles through designed controllers. However, due to the unique elongated structure and biomimetic locomotion modes of snake robots, the rest of the robot's body can still collide with obstacles. To resolve this problem, we propose a novel real-time obstacle avoidance strategy for a class of terrestrial snake robots with a multi-link elongated body using model predictive control (MPC). Moreover, by leveraging the elongated body characteristics of the robot, an improved path guidance strategy is also developed. The effectiveness of the proposed strategies is verified and validated through extensive simulations and experiments on a custom-built nine-link elongated snake robot. The results demonstrate that all links of the robot can avoid obstacles well while continuing to track the given path.
|
| |
| 15:30-17:00, Paper MoBIP-04.9 | Add to My Program |
| Safety-Critical Coordination for Cooperative Legged Locomotion Via Control Barrier Functions |
|
| Kim, Jeeseop | Caltech |
| Lee, Jaemin | California Institute of Technology |
| Ames, Aaron | Caltech |
Keywords: Motion Control, Legged Robots, Robot Safety
Abstract: This paper presents a safety-critical approach to the coordinated control of cooperative robots locomoting in the presence of fixed (holonomic) constraints. To this end, we leverage control barrier functions (CBFs) to ensure the safe cooperation of the robots while maintaining a desired formation and avoiding obstacles. The top-level planner generates a set of feasible trajectories, accounting for both kinematic constraints between the robots and physical constraints of the environment. This planner leverages CBFs to ensure safety-critical coordination control, i.e., guarantee safety of the collaborative robots during locomotion. The middle-level trajectory planner incorporates interconnected single rigid body (SRB) dynamics to generate optimal ground reaction forces (GRFs) to track the safety-ensured trajectories from the top-level planner while addressing the interconnection dynamics between agents. Distributed low-level controllers generate whole-body motion to follow the prescribed optimal GRFs while ensuring the friction cone condition at each end of the stance legs. The effectiveness of the approach is demonstrated through numerical simulations and experimentally on a pair of quadrupedal robots.
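As an illustrative aside, a control barrier function keeps a system inside a safe set by filtering a desired input. The minimal one-dimensional sketch below (not the paper's multi-agent legged formulation; the function names `cbf_filter` and `simulate` are ours) shows the core mechanism: the CBF-QP reduces to a closed-form clamp in 1-D.

```python
# Minimal 1-D control barrier function (CBF) safety filter.
# Illustrative sketch only -- not the paper's coordinated-locomotion scheme.
# System: xdot = u; safe set h(x) = x >= 0; CBF condition: u >= -alpha * h(x).
# The QP  min ||u - u_des||^2  s.t.  u >= -alpha * h(x)  has the closed-form
# solution u* = max(u_des, -alpha * h(x)).

def cbf_filter(u_des: float, x: float, alpha: float = 1.0) -> float:
    """Project the desired input onto the CBF-safe set (closed form in 1-D)."""
    return max(u_des, -alpha * x)

def simulate(x0: float, u_des: float, alpha: float = 1.0,
             dt: float = 0.01, steps: int = 1000) -> float:
    """Integrate xdot = cbf_filter(u_des, x) with forward Euler."""
    x = x0
    for _ in range(steps):
        x += dt * cbf_filter(u_des, x, alpha)
    return x
```

Even when the desired input drives the state toward the unsafe region, the filtered input only lets the barrier value decay toward zero, so the state never leaves the safe set.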
|
| |
| 15:30-17:00, Paper MoBIP-04.10 | Add to My Program |
| Staged Contact Optimization: Combining Contact-Implicit and Multi-Phase Hybrid Trajectory Optimization |
|
| Turski, Michael R. | Carnegie Mellon University |
| Norby, Joseph | Apptronik |
| Johnson, Aaron M. | Carnegie Mellon University |
Keywords: Multi-Contact Whole-Body Motion Planning and Control, Legged Robots, Optimization and Optimal Control
Abstract: Trajectory optimization problems for legged robots are commonly formulated with fixed contact schedules. These multi-phase Hybrid Trajectory Optimization (HTO) methods result in locally optimal trajectories, but the result depends heavily upon the predefined contact mode sequence. Contact-Implicit Optimization (CIO) offers a potential solution to this issue by allowing the contact mode to be determined throughout the trajectory by the optimization solver. However, CIO suffers from long solve times and convergence issues. This work combines the benefits of these two methods into one algorithm: Staged Contact Optimization (SCO). SCO tightens constraints on contact in stages, eventually fixing them to allow robust and fast convergence to a feasible solution. Results on a planar biped and spatial quadruped demonstrate speed and optimality improvements over CIO and HTO. These properties make SCO well suited for offline trajectory generation or as an effective tool for exploring the dynamic capabilities of a robot.
|
| |
| 15:30-17:00, Paper MoBIP-04.11 | Add to My Program |
| Hierarchical Relaxation of Safety-Critical Controllers: Mitigating Contradictory Safety Conditions with Application to Quadruped Robots |
|
| Lee, Jaemin | California Institute of Technology |
| Kim, Jeeseop | Caltech |
| Ames, Aaron | Caltech |
Keywords: Motion Control, Legged Robots, Robot Safety
Abstract: The safety-critical control of robotic systems often must account for multiple, potentially conflicting, safety constraints. This paper proposes novel relaxation techniques to address safety-critical control problems in the presence of conflicting safety conditions. In particular, Control Barrier Functions (CBFs) provide a means to encode safety as constraints in a Quadratic Program (QP), wherein multiple safety conditions yield multiple constraints. However, the QP problem becomes infeasible when the safety conditions cannot be simultaneously satisfied. To resolve this potential infeasibility, we introduce a hierarchy between the safety conditions and employ an additional variable to relax the less important safety conditions (Relaxed-CBF-QP). We also formulate a cascaded structure to achieve smaller violations of lower-priority safety conditions (Hierarchical-CBF-QP). The proposed approach, therefore, ensures the existence of at least one solution to the QP problem with the CBFs while dynamically balancing enforcement of additional safety constraints. Importantly, this paper evaluates the impact of different weighting factors in the Hierarchical-CBF-QP and, due to the sensitivity of these weightings in the observed behavior, proposes a method to determine the weighting factors via a sampling-based technique. The validity of the proposed approach is demonstrated through simulations and experiments on a quadrupedal robot navigating to a goal through regions with different levels of danger.
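A hedged sketch of the relaxation idea in generic CBF-QP notation (symbols ours; the paper's exact hierarchy and weighting may differ): with a high-priority CBF h1 kept as a hard constraint and a lower-priority CBF h2 softened by a slack variable δ penalized with weight ρ,

```latex
\begin{aligned}
\min_{u,\,\delta}\quad & \lVert u - u_{\mathrm{des}}\rVert^{2} + \rho\,\delta^{2} \\
\text{s.t.}\quad
& L_f h_1(x) + L_g h_1(x)\,u \ \ge\ -\alpha_1\big(h_1(x)\big), \\
& L_f h_2(x) + L_g h_2(x)\,u \ \ge\ -\alpha_2\big(h_2(x)\big) - \delta,
\qquad \delta \ge 0.
\end{aligned}
```

The slack guarantees the QP remains feasible whenever the hard constraint alone is feasible, while ρ trades off how much the lower-priority safety condition may be violated.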
|
| |
| 15:30-17:00, Paper MoBIP-04.12 | Add to My Program |
| A Recursive Lie-Group Formulation for the Second-Order Time Derivatives of the Inverse Dynamics of Parallel Kinematic Manipulators |
|
| Mueller, Andreas | Johannes Kepler University |
| Kumar, Shivesh | DFKI GmbH |
| Kordik, Thomas | Johannes Kepler University, Institute of Robotics |
Keywords: Parallel Robots, Compliant Joints and Mechanisms, Motion Control
Abstract: Series elastic actuators (SEA) were introduced for serial robotic arms. Their model-based trajectory tracking control requires the second time derivatives of the inverse dynamics solution, for which algorithms were proposed. Trajectory control of parallel kinematics manipulators (PKM) equipped with SEAs has not yet been pursued. Key element for this is the computationally efficient evaluation of the second time derivative of the inverse dynamics solution. This has not been presented in the literature, and is addressed in the present paper for the first time. The special topology of PKM is exploited reusing the recursive algorithms for evaluating the inverse dynamics of serial robots. A Lie group formulation is used and all relations are derived within this framework. Numerical results are presented for a 6-DOF Gough-Stewart platform (as part of an exoskeleton), and for a planar PKM when a flatness-based control scheme is applied.
|
| |
| 15:30-17:00, Paper MoBIP-04.13 | Add to My Program |
| Manipulator Differential Kinematics Part I: Kinematics, Velocity, and Applications (I) |
|
| Haviland, Jesse | Queensland University of Technology |
| Corke, Peter | Queensland University of Technology |
Keywords: Kinematics, Motion Control, Manipulation Planning
Abstract: Manipulator kinematics is concerned with the motion of each link within a manipulator without considering mass or force. In this article, which is the first in a two-part tutorial, we provide an introduction to modelling manipulator kinematics using the elementary transform sequence (ETS). Then we formulate the first-order differential kinematics, which leads to the manipulator Jacobian, which is the basis for velocity control and inverse kinematics. We describe essential classical techniques which rely on the manipulator Jacobian before exhibiting some contemporary applications. Part II of this tutorial provides a formulation of second and higher-order differential kinematics, introduces the manipulator Hessian, and illustrates advanced techniques, some of which improve the performance of techniques demonstrated in Part I.
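As a minimal worked instance of the manipulator Jacobian the tutorial builds up to (a planar 2-link arm in plain Python; the tutorial itself derives kinematics from the ETS, which this sketch does not reproduce, and the function names are ours):

```python
import math

# Forward kinematics and analytic Jacobian of a planar 2-link arm with link
# lengths l1, l2. The Jacobian maps joint velocities to end-effector velocity,
# the basis for velocity control and inverse kinematics.

def fk(q, l1=1.0, l2=1.0):
    """End-effector position (x, y) for joint angles q = (q1, q2)."""
    q1, q2 = q
    return (l1 * math.cos(q1) + l2 * math.cos(q1 + q2),
            l1 * math.sin(q1) + l2 * math.sin(q1 + q2))

def jacobian(q, l1=1.0, l2=1.0):
    """2x2 analytic Jacobian d(x, y)/d(q1, q2)."""
    q1, q2 = q
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [ l1 * c1 + l2 * c12,  l2 * c12]]
```

A quick finite-difference check of `jacobian` against `fk` confirms the analytic expressions.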
|
| |
| MoBIP-05 Regular session, Hall E |
Add to My Program |
| Mechanism Design II |
|
| |
| |
| 15:30-17:00, Paper MoBIP-05.1 | Add to My Program |
| Compliant Suction Gripper with Seamless Deployment and Retraction for Robust Picking against Depth and Tilt Errors |
|
| Yoo, Yuna | Seoul National University |
| Eom, Jaemin | Seoul National University Biorobotics Lab |
| Park, Min Jo | Seoul National University |
| Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
Keywords: Mechanism Design, Soft Robot Applications, Grippers and Other End-Effectors
Abstract: Applying suction grippers in unstructured environments is a challenging task because of depth and tilt errors in vision systems, requiring additional costs in elaborate sensing and control. To reduce additional costs, suction grippers with compliant bodies or mechanisms have been proposed; however, their bulkiness and limited allowable error hinder their use in complex environments with large errors. Here, we propose a compact suction gripper that can pick objects over a wide range of distances and tilt angles without elaborate sensing and control. The spring-inserted gripper body deploys and conforms to distant and tilted objects until the suction cup completely seals with the object and retracts immediately after, while holding the object. This seamless deployment and retraction is enabled by connecting the gripper body and suction cup to the same vacuum source, which couples the vacuum picking and retraction of the gripper body. Experimental results validated that the proposed gripper can pick objects within 79 mm, which is 1.4 times the initial length, and can pick objects with tilt angles up to 60°. The feasibility of the gripper was verified by demonstrations, including picking objects of different heights from the same picking height and the bin picking of transparent objects.
|
| |
| 15:30-17:00, Paper MoBIP-05.2 | Add to My Program |
| Design of Novel Knee Joint Mechanism of Lower-Limb Exoskeleton to Realize Spatial Motion of Human Knee |
|
| Hong, Man Bok | Agency for Defense Development |
| Kim, Yongcheol | Agency for Defense Development |
| Kim, Gwang Tae | Agency for Defense Development |
| Lee, Myunghyun | Agency for Defense Development |
| Kim, Seonwoo | Agency for Defense Development |
Keywords: Kinematics, Prosthetics and Exoskeletons, Mechanism Design
Abstract: The rotation axis of the human knee joint varies with the knee flexion angle; that is, human knee movement is spatial, with dominant flexional rotation. The knee joints of most lower-limb exoskeletons are, however, realized with a simple revolute pair for design simplicity. Wearing a knee joint with a simple revolute pair inevitably constrains the natural parasitic motion of the human knee. The rigid constraints imposed by such a simple knee joint lower wearability and comfort, and may pose a risk of harming the knee joint structure of the wearer. The polycentric knee is a well-known concept for mimicking the variation of the knee rotation center. Polycentric knees, however, realize only the planar trace of the projected points of the rotation axis; the parasitic knee rotations of varus and internal rotation that occur naturally during knee flexion cannot be realized by polycentric knees. To resolve this, a novel spherical knee joint for exoskeletons is introduced in this paper to realize spherical knee movements. For the design, the change in the instantaneous rotation axis during knee flexion is formulated as a spherical trace as a function of the knee flexion angle. A spherical four-bar linkage is suggested to realize the axis trajectory, not merely the point trace. A method to find the instantaneous rotation axis and rotation matrix of the coupler link is derived for a given input-link angle. Using these kinematic relations, the kinematic parameters of the mechanism are optimized to minimize the angle difference between the instantaneous axis of the coupler and the required axis of knee rotation. Finally, based on the synthesized kinematic parameters, a prototype design is introduced.
|
| |
| 15:30-17:00, Paper MoBIP-05.3 | Add to My Program |
| A Novel Coiled Cable-Conduit-Driven Hyper-Redundant Manipulator for Remote Operating in Narrow Spaces |
|
| Luo, Mingrui | Institute of Automation, Chinese Academy of Sciences |
| Tian, Yunong | Institute of Automation, Chinese Academy of Sciences |
| Li, En | Institute of Automation, Chinese Academy of Sciences |
| Chen, Minghao | Institute of Automation, Chinese Academy of Sciences |
| Kang, Cunfeng | Beijing University of Technology |
| Yang, Guodong | Institute of Automation, Chinese Academy of Sciences |
| Tan, Min | Institute of Automation, Chinese Academy of Sciences |
Keywords: Redundant Robots, Tendon/Wire Mechanism, Telerobotics and Teleoperation
Abstract: Operating in narrow spaces is an important challenge in the development of robots. Redundant manipulators are one way to address this problem, but their mechanism design and control methods still have much room for improvement. In this paper, we propose a coiled cable-conduit-driven hyper-redundant manipulator (C-CDHRM) with great slenderness and flexibility. In terms of mechanism design, it considers both compactness and operability: by imitating the structure and behavior of a constricting snake, it can be uncoiled sequentially from a coiled storage state, led by the head. In terms of control, we propose a multi-layer control system that makes remote operations more accurate and reliable. On the one hand, guiding, segmenting, and following the path overcomes the planning ambiguity caused by redundancy. On the other hand, conduit transmission modeling and cable length correction overcome the nonlinear mapping of cable-driven joints, as verified in experiments. Through tests, the mobile integrated system built around the C-CDHRM shows excellent operation precision and accuracy, ensuring safety and accessibility in narrow spaces. Finally, in field experiments, the inspection and cleaning of various types of electrical equipment were successfully completed, showing excellent application prospects.
|
| |
| 15:30-17:00, Paper MoBIP-05.4 | Add to My Program |
| Design and Testing of a Flexure-Based XYZ Micropositioner with High Space-Utilization Efficiency |
|
| Lyu, Zekui | University of Macau |
| Xu, Qingsong | University of Macau |
Keywords: Compliant Joints and Mechanisms, Grippers and Other End-Effectors, Automation at Micro-Nano Scales
Abstract: Flexure-based XYZ micropositioners with hybrid configurations have become more prevalent due to their low motion coupling and high motion precision. However, traditional mechanism designs suffer from a large planar footprint with Z-stage stacking, which leads to low space-utilization efficiency. To address this issue, a novel conceptual design is proposed in this paper by integrating a spatially structured XY stage with an embedded Z stage. After completing the mechanism design, the driving stiffness of the stage along the three axes is evaluated by mechanics analysis. The model is then verified through a finite element analysis simulation study and experimental tests. The theoretical model, simulation results, and experimental data show good agreement. Experimental results show that the proposed flexure-based XYZ micropositioner delivers a stroke of 4.15 mm x 4.06 mm x 0.04 mm within a physical size of 116 mm x 116 mm x 45 mm. A performance comparison reveals its superior space-utilization efficiency. Given the feasibility of the proposed conceptual design, it provides a reference for the diversified and refined design of XYZ micropositioners.
|
| |
| 15:30-17:00, Paper MoBIP-05.5 | Add to My Program |
| Design and Development of a Deformable In-Pipe Inspection Robot for Various Diameter Pipes |
|
| Xu, Huafeng | The Hong Kong Polytechnic University |
| Cao, Jiannong | The Hong Kong Polytechnic University |
| Cheng, Zhiqin | The Hong Kong Polytechnic University |
| Liang, Zhixuan | The Hong Kong Polytechnic University |
| Chen, Jinlin | Hong Kong Polytechnic University |
Keywords: Wheeled Robots, Mechanism Design, Kinematics
Abstract: Pipelines have become one of the most important infrastructures in cities. Over time, they are prone to aging, cracks, and corrosion, and the need for regular inspection is gradually increasing. Robotic solutions are effective methods for in-pipe inspection. However, existing In-pipe Inspection Robots (IPIRs) require the inner diameter of the pipe to be fixed in the application scenario, and need extra labor to control the robot and handle the cable. In this work, we design and develop a deformable robot that adapts to pipes with different inner diameters. Specifically, we use a passive elastic hinge to keep the robot fully in contact with the pipe, generating enough friction to ensure that the robot stays attached to the inner wall. An edge device is deployed on the robot, generating wheel velocity commands from Inertial Measurement Unit (IMU) data, which eliminates the need for external devices. Experimental results demonstrate that the robot can move in horizontal and vertical pipelines, as well as traverse pipe joints and scenarios with dirt or small obstacles.
|
| |
| 15:30-17:00, Paper MoBIP-05.6 | Add to My Program |
| A Bioinspired Underactuated Dual Tendon-Based Adaptive Gripper for Space Applications |
|
| Isakhani, Hamid | University of Birmingham |
| Nefti-Meziani, Samia | University of Salford |
| Davis, Steven | University of Birmingham |
| Isakhani, Helya | Rebelya LTD |
Keywords: Tendon/Wire Mechanism, Space Robotics and Automation, Additive Manufacturing
Abstract: Hands are among the most intricate elements of a humanoid due to their role as end-effectors interacting with a non-linear surrounding environment. This paper presents the design of a bioinspired underactuated robotic hand with improved dexterity that is capable of adaptive grasping and manipulation of a wide range of objects using a dual-tendon mechanism. The proposed design focuses on the key elements of scalability, modularity, ease of fabrication, and cost efficiency to meet several imperative constraints of space applications. These features are achieved by introducing a novel actuation mechanism, manufacturing methods, and component design. In particular, monolithic finger modules are fabricated by fusing and integrating both hard and soft materials, analogous to bones wrapped in muscles, using economical and readily available materials and machines (an intermediate 3D printer). The weight-to-power ratio, actuation optimisation, design trade-offs, and various potential applications of the proposed adaptive hand are discussed in this paper. Furthermore, the prototype's performance is evaluated in different scenarios, ultimately confirming its improved dexterity and gripping power compared with the literature.
|
| |
| 15:30-17:00, Paper MoBIP-05.7 | Add to My Program |
| Parallel-Jaw Gripper and Grasp Co-Optimization for Sets of Planar Objects |
|
| Jiang, Rebecca H. | Massachusetts Institute of Technology |
| Doshi, Neel | MIT |
| Gondhalekar, Ravi | The Charles Stark Draper Laboratory |
| Rodriguez, Alberto | Massachusetts Institute of Technology |
Keywords: Grippers and Other End-Effectors, Grasping, Methods and Tools for Robot System Design
Abstract: We propose a framework for optimizing a planar parallel-jaw gripper for use with multiple objects. While optimizing general-purpose grippers and contact locations for grasps are both well studied, co-optimizing grasps and the gripper geometry to execute them receives less attention. As such, our framework synthesizes grippers optimized to stably grasp sets of polygonal objects. Given a fixed number of contacts and their assignments to object faces and gripper jaws, our framework optimizes contact locations along these faces, gripper pose for each grasp, and gripper shape. Our key insights are to pose shape and contact constraints in frames fixed to the gripper jaws, and to leverage the linearity of constraints in our grasp stability and gripper shape models via an augmented Lagrangian formulation. Together, these enable a tractable nonlinear program implementation. We apply our method to several examples. The first illustrative problem shows the discovery of a geometrically simple solution where possible. In another, space is constrained, forcing multiple objects to be contacted by the same features as each other. Finally a toolset-grasping example shows that our framework applies to complex, real-world objects. We provide a physical experiment of the toolset grasps.
|
| |
| 15:30-17:00, Paper MoBIP-05.8 | Add to My Program |
| Inertial Propulsion Robot Using the Shape Characteristics of a Streamlined Body Frame |
|
| Nishihara, Masatsugu | JAIST |
| Asano, Fumihiko | Japan Advanced Institute of Science and Technology |
Keywords: Underactuated Robots, Dynamics, Industrial Robots
Abstract: We have been investigating a crawling-like locomotion robot that slides forward efficiently based on simple system and control mechanisms on a slippery level surface, where the motion of the center of mass plays an important role. In this paper, we induce an effective motion of the center of mass by considering a streamlined body shape for a locomotion robot in which a pendulum is installed. First, we derive the equation of motion and a control input to achieve a desired motion of the inner pendulum. Second, we formulate constraint conditions between the streamlined body and a slippery floor. Third, we demonstrate through numerical simulation that the robot steadily slides forward by adopting the streamlined shape as the body frame. Fourth, we verify the numerical results through experiments, and the experimental results exhibit a tendency similar to the numerical results. Fifth, we find a local minimum value of locomotion efficiency via Bayesian optimization, a class of machine-learning-based optimization, and achieve exceedingly efficient locomotion of the robot on the slippery floor at this local minimum in both simulation and experiment.
|
| |
| 15:30-17:00, Paper MoBIP-05.9 | Add to My Program |
| Two-Stage Trajectory-Tracking Control of Cable-Driven Upper-Limb Exoskeleton Robots with Series Elastic Actuators: A Simple, Accurate, and Force-Sensorless Method |
|
| Shu, Yana | Tsinghua University |
| Chen, Yu | Tsinghua University |
| Zhang, Xuan | Tsinghua University |
| Zhang, Shisheng | Shenyang Jianzhu University |
| Chen, Gong | Shenzhen MileBot Robotics |
| Ye, Jing | Shenzhen MileBot Robotics Co. Ltd |
| Li, Xiang | Tsinghua University |
Keywords: Actuation and Joint Mechanisms, Motion Control, Prosthetics and Exoskeletons
Abstract: The advantages of cable-driven exoskeleton robots with series elastic actuators can be summarized in twofold: 1) the inertia of the robot joint is relatively low, which is more friendly for human-robot interaction; 2) the elastic element is tolerant to impacts and hence provides structural safety. As trade-offs, the overall dynamic model of such a system is of high order and subject to both unmodelled disturbances (due to the cable-driven mechanism) and external torques (due to the human-robot interaction), opening up challenges for the controller development. This paper proposes a new trajectory-tracking control scheme for cable-driven upper-limb exoskeleton robots with series elastic actuators. The control objectives are achieved in two stages: Stage I is to approximate then compensate for unmodelled disturbances with iterative learning techniques; Stage II is to employ a suboptimal model predictive controller to drive the robot to track the desired trajectory. While controlling such a robot is not trivial, the proposed control scheme exhibits the advantages of force-sensorlessness, high accuracy, and low complexity compared with other methods in the real-world experiments.
|
| |
| 15:30-17:00, Paper MoBIP-05.10 | Add to My Program |
| A Retractable Soft Growing Robot with a Flexible Backbone |
|
| Pi, Xinyi | University of Sheffield |
| Szczech, Isabella Ann | The University of Sheffield |
| Cao, Lin | University of Sheffield |
Keywords: Mechanism Design, Soft Robot Materials and Design, Soft Sensors and Actuators
Abstract: Soft-growing robots are emerging with numerous potential applications because of their superior capability of frictionless navigation. However, their success is hindered by their tendency to buckle under the tension required to retract them via inversion. In this paper, we propose a simple and scalable tubular backbone to facilitate retracting the robot body without buckling. With this backbone, compressive forces at the robot's tip are mitigated and a limit is placed on the effective length for retraction during the application of tension. We first present the selection of the backbone and the development of such a retractable soft-growing robot. Along with the characterization of the working principles behind this buckling-free mechanism, success was observed with the use of the backbone in retraction tests. The effects of different parameters such as robot body lengths, air pressures, curvatures, and retraction modes on the performance were also investigated. This backbone approach requires no bulky or in-situ mechatronic components inside the robot body and thus may be used in medical applications which appreciate simple, compact, and in-situ electronic-free designs.
|
| |
| 15:30-17:00, Paper MoBIP-05.11 | Add to My Program |
| CurveQuad: A Centimeter-Scale Origami Quadruped That Leverages Curved Creases to Self-Fold and Crawl with One Motor |
|
| Feshbach, Daniel | University of Pennsylvania |
| Wu, Xuelin | University of Pennsylvania |
| Vasireddy, Satviki | Princeton Day School |
| Beardell, Louis | Episcopal Academy |
| To, Bao | Peddie School |
| Baryshnikov, Yuliy | UIUC |
| Sung, Cynthia | University of Pennsylvania |
Keywords: Underactuated Robots, Soft Robot Materials and Design, Flexible Robotics
Abstract: We present CurveQuad, a miniature curved origami quadruped that is able to self-fold and unfold, crawl, and steer, all using a single actuator. CurveQuad is designed for planar manufacturing, with parts that attach and stack sequentially on a flat body. The design uses 4 curved creases pulled by 2 pairs of tendons from opposite ends of a link on a 270° servo. It is 8 cm in the longest direction and weighs 10.9 g. Rotating the horn pulls the tendons inwards to induce folding. Continuing to rotate the horn shears the robot, enabling the robot to shuffle forward while turning in either direction. We experimentally validate the robot's ability to fold, steer, and unfold by changing the magnitude of horn rotation. We also demonstrate basic feedback control by steering towards a light source from a variety of starting positions and orientations, and swarm aggregation by having 4 robots simultaneously steer towards the light. The results demonstrate the potential of using curved crease origami in self-assembling and deployable robots with complex motions such as locomotion.
|
| |
| 15:30-17:00, Paper MoBIP-05.12 | Add to My Program |
| A Pendulum-Driven Legless Rolling Jumping Robot |
|
| Buzhardt, Jake | Clemson University |
| Chivkula, Prashanth | Clemson University |
| Tallapragada, Phanindra | Clemson University |
Keywords: Underactuated Robots, Dynamics, Passive Walking
Abstract: In this paper, we present a novel rolling, jumping robot. The robot consists of a driven pendulum mounted to a wheel in a compact, lightweight, 3D printed design. We show that by driving the pendulum to shift the robot's weight distribution, the robot is able to obtain significant rolling speed, achieve jumps of up to 2.5 body lengths vertically, and clear horizontal distances of over 6 body lengths. The robot's dynamic model is derived and simulation results indicate that it is consistent with the rolling motion and jumping observed on the robot. The ability to both roll and jump effectively using a minimalistic design makes this robot unique and could inspire the use of similar mechanisms on robots intended for applications in which agile locomotion on unstructured terrain is necessary, such as disaster response or planetary exploration.
|
| |
| 15:30-17:00, Paper MoBIP-05.13 | Add to My Program |
| AcroMonk: A Minimalist Underactuated Brachiating Robot |
|
| Javadi, Mahdi | German Research Center for Artificial Intelligence (DFKI), Robotics Innovation Center |
| Harnack, Daniel | Deutsches Forschungszentrum für Künstliche Intelligenz |
| Stocco, Paula | Stanford University |
| Kumar, Shivesh | DFKI GmbH |
| Vyas, Shubham | Robotics Innovation Center, DFKI GmbH |
| Pizzutilo, Daniel | DFKI RIC |
| Kirchner, Frank | University of Bremen |
Keywords: Underactuated Robots, Biologically-Inspired Robots, Education Robotics
Abstract: Brachiation is a dynamic, coordinated swinging maneuver of body and arms used by monkeys and apes to move between branches. As a unique underactuated mode of locomotion, it is interesting to study from a robotics perspective since it can broaden the deployment scenarios for humanoids and animaloids. While several brachiating robots of varying complexity have been proposed in the past, this paper presents the simplest possible prototype of a brachiation robot, using only a single actuator and unactuated grippers. The novel passive gripper design allows it to snap on and release from monkey bars, while guaranteeing well-defined start and end poses of the swing. The brachiation behavior is realized in three different ways: using trajectory optimization via direct collocation with stabilization by a model-based time-varying linear quadratic regulator (TVLQR) or model-free proportional-derivative (PD) control, and using a reinforcement learning (RL) based control policy. The three control schemes are compared in terms of robustness to disturbances, mass uncertainty, and energy consumption. The system design and controllers have been open-sourced. Due to its minimal and open design, the system can serve as a canonical underactuated platform for education and research.
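The model-based stabilization scheme named above is a time-varying LQR. As an illustrative sketch only (this is not the authors' open-sourced implementation, and the dynamics matrices below are a hypothetical pendulum-like linearization, not AcroMonk's identified model), the backward Riccati recursion that produces the time-varying feedback gains can be written as:

```python
import numpy as np

def tvlqr_gains(A_list, B_list, Q, R, Qf):
    """Backward Riccati recursion for a finite-horizon, time-varying LQR.

    A_list, B_list: linearizations along the trajectory,
    x_{k+1} = A_k x_k + B_k u_k. Returns gains K_k for u_k = -K_k x_k."""
    N = len(A_list)
    P = Qf
    gains = [None] * N
    for k in reversed(range(N)):
        A, B = A_list[k], B_list[k]
        # K_k = (R + B'PB)^{-1} B'PA, then propagate the cost-to-go P
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains[k] = K
    return gains

# Hypothetical linearization of a one-actuator swing about the upright
dt = 0.01
A = np.array([[1.0, dt], [9.81 * dt, 1.0]])
B = np.array([[0.0], [dt]])
Q, R, Qf = np.eye(2), np.eye(1), 10 * np.eye(2)
gains = tvlqr_gains([A] * 100, [B] * 100, Q, R, Qf)
```

Along a real swing trajectory the `A_list`/`B_list` entries would differ at each step; here a constant linearization keeps the sketch short.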
|
| |
| 15:30-17:00, Paper MoBIP-05.14 | Add to My Program |
| Design and Verification of Parallelogram Mechanism with Geared Unit Rolling Joints for Reliable Wiring |
|
| Suh, Jungwook | Kyungpook National University (KNU) |
| Choi, Wontae | Kyungpook National University (KNU) |
Keywords: Mechanism Design, Tendon/Wire Mechanism, Kinematics
Abstract: The structure of 1-DOF joints used in existing robots is generally a revolute joint or a prismatic joint. However, recently, attempts have been made to apply rolling joints to reduce the size and weight of surgical and humanoid robots. In this study, to secure the advantages of wire routing through robot joints, a new method for applying geared rolling units to a parallelogram mechanism is proposed. First, a kinematic analysis of the proposed gear-based mechanism is explained in comparison with the existing pivot-based mechanism. In addition, the importance of the radii of the gears is verified through force analysis to prevent damage to the applied gears, as well as through the analysis of actuation torque and singular positions, in which the parallelogram can convert into an anti-parallelogram. The effect of stable wiring was verified through an experiment using a cable-driven prototype. Consequently, the proposed parallelogram composed of rolling units is expected to be applied to various robot configurations owing to its advantages.
|
| |
| MoBIP-06 Regular session, Hall E |
Add to My Program |
| Modeling, Control, and Learning for Soft Robots II |
|
| |
| |
| 15:30-17:00, Paper MoBIP-06.1 | Add to My Program |
| Vine Robot Localization Via Collision |
|
| Frias-Miranda, Eugenio | Purdue University |
| Srivastava, Alankriti | Purdue University |
| Wang, Sicheng | Purdue University |
| Blumenschein, Laura | Purdue University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Materials and Design, Localization
Abstract: Localization of robots is a complex task that is often hindered by the sensors these systems use. Because most field robots are rigid, their sensing modalities share common failure modes, such as degraded performance when camera vision is obscured. Rigid systems also lack flexibility when traversing multiple environments, especially uneven and unpredictable ground. Soft robots, which can adaptably interact with the environment, could address both problems. One soft robot in particular, the Vine Robot, has shown excellent performance while moving through constrained, unpredictable environments, making it an ideal candidate for a novel sensing and localization method: obstacle-collision localization. We use our understanding of Vine Robot motion to predict the tip position of the robot at every instant based on sensor feedback. In single-obstacle experiments, our algorithm provided a precise estimate of the robot's tip position in differing environments. Further, in a multi-obstacle demonstration, the path prediction showed less than 5 percent maximum error relative to the full robot length. Our study lays the foundation for a new method of Vine Robot localization that uses contact as a sensing modality.
|
| |
| 15:30-17:00, Paper MoBIP-06.2 | Add to My Program |
| Mapping Unknown Environments through Passive Deformation of Soft, Growing Robots |
|
| Fuentes, Francesco | Purdue University |
| Blumenschein, Laura | Purdue University |
Keywords: Modeling, Control, and Learning for Soft Robots, Mapping
Abstract: When faced with an unstructured environment filled with an unknown number and size of obstacles on a chaotic terrain, it can be a challenge to determine the best method of navigating and mapping the space. This problem, known as Simultaneous Localization and Mapping (SLAM), has typically been approached using vision-based solutions, but these solutions require clear visual conditions in order to function optimally. A different approach to sensing environments has been explored in soft robotic systems, specifically by sensing changes in the environment through sensing changes in the robot's configuration. Building on this idea, we introduce a method of mapping based on colliding with and deforming around obstacles using a soft, growing robot. Instead of avoiding obstacles, as is typically done to protect robots, we take advantage of the soft, growing robot's compliance in order to navigate through, and collect information about, the environment. Through the construction and testing of a geometry-based simulation, we analyzed the behavior and effectiveness of this approach for mapping by generating random launch positions and collecting information from contacted obstacles and traversed regions. Through a plethora of randomly generated environments, we determine that: 1) the density of obstacles in an environment has minimal impact on mapping abilities and 2) at least 70% of each environment tested can be mapped by deploying 20 or fewer soft, growing robots.
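The second finding above (coverage versus number of deployed robots) comes from a geometry-based simulation. The sketch below is a deliberately crude stand-in under strong assumptions, straight-line growth, no obstacles or deformation, and hypothetical `grid` and `reach` parameters; it only illustrates the Monte-Carlo structure of estimating mapped area from random launch positions:

```python
import numpy as np

rng = np.random.default_rng(0)

def coverage_after_deployments(n_robots, grid=20, reach=30.0):
    """Fraction of a grid 'mapped' by n straight-line robot deployments."""
    seen = np.zeros((grid, grid), dtype=bool)
    for _ in range(n_robots):
        # Random launch position and growth direction
        x, y = rng.uniform(0, grid, size=2)
        theta = rng.uniform(0, 2 * np.pi)
        # Mark every cell the robot passes through while growing
        for t in np.linspace(0.0, reach, 200):
            px, py = x + t * np.cos(theta), y + t * np.sin(theta)
            if 0 <= px < grid and 0 <= py < grid:
                seen[int(px), int(py)] = True
    return seen.mean()

frac = coverage_after_deployments(20)
```

In the paper's simulation, deformation around contacted obstacles is what contributes the mapping information; this toy omits that entirely.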
|
| |
| 15:30-17:00, Paper MoBIP-06.3 | Add to My Program |
| Stable Real-Time Feedback Control of a Pneumatic Soft Robot |
|
| Even, Sean | University of Notre Dame |
| Zheng, Tongjia | University of Notre Dame |
| Lin, Hai | University of Notre Dame |
| Ozkan-Aydin, Yasemin | University of Notre Dame |
Keywords: Modeling, Control, and Learning for Soft Robots, Biomimetics, Soft Robot Applications
Abstract: Soft actuators offer compliant and safe interaction with an unstructured environment compared to their rigid counterparts. However, control of these systems is often challenging because they are inherently under-actuated, have infinite degrees of freedom (DoF), and their mechanical properties can change under unknown external loads. Existing works mainly rely on discretization and model reduction, suffering from either low accuracy or high computational cost for real-time control purposes. Recently, we presented an infinite-dimensional feedback controller for soft manipulators modeled by partial differential equations (PDEs) based on Cosserat rod theory. In this study, we examine how to implement this controller in real time using only a limited number of actuators. To do so, we formulate a convex quadratic programming problem that tunes the feedback gains of the controller in real time so that it becomes realizable by the actuators. We evaluate the controller's performance through experiments on a physical soft robot capable of planar motions and show that the controller implemented by the finite-dimensional actuators still preserves the stabilizing property of the desired infinite-dimensional controller. This research fills the gap between infinite-dimensional control design and finite-dimensional actuation in practice and suggests a promising direction for PDE-based control design for soft robots.
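The realizability idea, projecting a distributed control law onto a few actuators, can be made concrete in miniature. The sketch below solves only the unconstrained least-squares version of such a projection (the paper formulates a constrained convex QP); the arclength grid, the Gaussian actuator influence shapes, and the desired force profile are all hypothetical:

```python
import numpy as np

# Arclength grid and three hypothetical Gaussian actuator influence shapes
s = np.linspace(0.0, 1.0, 100)
centers = [0.2, 0.5, 0.8]
Phi = np.stack([np.exp(-((s - c) / 0.2) ** 2) for c in centers], axis=1)

# Hypothetical desired distributed feedback force from the PDE-level law
f_des = np.sin(np.pi * s)

# Unconstrained stand-in for the realizability problem:
#   min_u || Phi u - f_des ||^2
u, *_ = np.linalg.lstsq(Phi, f_des, rcond=None)
residual = np.linalg.norm(Phi @ u - f_des) / np.linalg.norm(f_des)
```

A constrained QP would additionally enforce actuator limits on `u`, which is where a dedicated QP solver replaces `lstsq`.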
|
| |
| 15:30-17:00, Paper MoBIP-06.4 | Add to My Program |
| Real2Sim2Real Transfer for Control of Cable-Driven Robots Via a Differentiable Physics Engine |
|
| Wang, Kun | Amazon.com LLC |
| Johnson, William | Yale University |
| Lu, Shiyang | Rutgers University |
| Huang, Xiaonan | University of Michigan |
| Booth, Joran | Yale University |
| Kramer-Bottiglio, Rebecca | Yale University |
| Aanjaneya, Mridul | Rutgers University |
| Bekris, Kostas E. | Rutgers, the State University of New Jersey |
Keywords: Modeling, Control, and Learning for Soft Robots, Simulation and Animation, Model Learning for Control
Abstract: Tensegrity robots, composed of rigid rods and flexible cables, exhibit high strength-to-weight ratios and significant deformations, which enable them to navigate unstructured terrains and survive harsh impacts. They are hard to control, however, due to high dimensionality, complex dynamics, and a coupled architecture. Physics-based simulation is a promising avenue for developing locomotion policies that can be transferred to real robots. Nevertheless, modeling tensegrity robots is a complex task due to a substantial sim2real gap. To address this issue, this paper describes a Real2Sim2Real (R2S2R) strategy for tensegrity robots. This strategy is based on a differentiable physics engine that can be trained given limited data from a real robot. These data include offline measurements of physical properties, such as mass and geometry for various robot components, and the observation of a trajectory using a random control policy. With the data from the real robot, the engine can be iteratively refined and used to discover locomotion policies that are directly transferable to the real robot. Beyond the R2S2R pipeline, key contributions of this work include computing non-zero gradients at contact points, a loss function for matching tensegrity locomotion gaits, and a trajectory segmentation technique that avoids conflicts in gradient evaluation during training. Multiple iterations of the R2S2R process are demonstrated and evaluated on a real 3-bar tensegrity robot.
|
| |
| 15:30-17:00, Paper MoBIP-06.5 | Add to My Program |
| Multi-Gait Locomotion Planning and Tracking for Tendon-Actuated Terrestrial Soft Robot (TerreSoRo) |
|
| Mahendran, Arun Niddish | The University of Alabama, Tuscaloosa |
| Freeman, Caitlin | University of Alabama |
| Chang, Alexander | Georgia Institute of Technology |
| McDougall, Michael | University of Strathclyde Glasgow |
| Vikas, Vishesh | University of Alabama |
| Vela, Patricio | Georgia Institute of Technology |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Motion and Path Planning
Abstract: The adaptability of soft robots makes them ideal candidates to maneuver through unstructured environments. However, locomotion challenges arise due to complexities in modeling the body mechanics, actuation, and robot-environment dynamics. These factors contribute to the gap between their potential and actual autonomous field deployment. A closed-loop path planning framework for soft robot locomotion is critical to close the real-world realization gap. This paper presents a generic path planning framework applied to TerreSoRo (Tetra-Limb Terrestrial Soft Robot) with pose feedback. It employs a gait-based, lattice trajectory planner to facilitate navigation in the presence of obstacles. The locomotion gaits are synthesized using a data-driven optimization approach that allows for learning from the environment. The trajectory planner employs a greedy breadth-first search strategy to obtain a collision-free trajectory. The synthesized trajectory is a sequence of rotate-then-translate gait pairs. The control architecture integrates high-level and low-level controllers with real-time localization (using an overhead webcam). TerreSoRo successfully navigates environments with obstacles where path re-planning is performed. To the best of our knowledge, this is the first instance of real-time, closed-loop path planning for a non-pneumatic soft robot.
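The core of such a lattice planner is breadth-first search over discrete motion primitives. The sketch below uses plain unit translations on a grid as a toy stand-in (the paper's planner expands rotate-then-translate gait pairs, not single cells), purely to show the search-and-backtrack structure:

```python
from collections import deque

def bfs_path(grid, start, goal):
    """BFS over a lattice; grid[r][c] == 1 marks an obstacle cell."""
    rows, cols = len(grid), len(grid[0])
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Backtrack through predecessors to recover the path
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in prev):
                prev[(nr, nc)] = node
                queue.append((nr, nc))
    return None  # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = bfs_path(grid, (0, 0), (2, 0))
```

BFS returns a shortest collision-free path in the number of primitive steps; replanning amounts to rerunning the search on an updated obstacle map.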
|
| |
| 15:30-17:00, Paper MoBIP-06.6 | Add to My Program |
| Learning Soft Robot Dynamics Using Differentiable Kalman Filters and Spatio-Temporal Embeddings |
|
| Liu, Xiao | Arizona State University |
| Ikemoto, Shuhei | Kyushu Institute of Technology |
| Yoshimitsu, Yuhei | Kyushu Institute of Technology |
| Ben Amor, Heni | Arizona State University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Soft Robot Materials and Design
Abstract: This paper introduces a novel approach for modeling the dynamics of soft robots, utilizing a differentiable filter architecture. The proposed approach enables end-to-end training to learn system dynamics, noise characteristics, and temporal behavior of the robot. A novel spatio-temporal embedding process is discussed to handle observations with varying sensor placements and sampling frequencies. The efficacy of this approach is demonstrated on a tensegrity robot arm by learning end-effector dynamics from demonstrations with complex bending motions. The model is shown to be robust against missing modalities, diverse sensor placement, and varying sampling rates. Additionally, the proposed framework is shown to identify physical interactions with humans during motion. The utilization of a differentiable filter presents a novel solution to the difficulties of modeling soft robot dynamics. Our approach shows substantial improvement in accuracy compared to state-of-the-art filtering methods, with at least a 24% reduction in mean absolute error (MAE) observed. Furthermore, the predicted end-effector positions show an average MAE of 25.77 mm from the ground truth, highlighting the advantage of our approach. The code is available at https://github.com/ir-lab/soft_robot_DEnKF.
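A differentiable filter makes the classical predict/update cycle trainable end to end. For orientation, the sketch below shows the plain linear Kalman filter cycle that such architectures build on, here on a hypothetical 1D constant-velocity model with position-only measurements (the paper's differentiable ensemble filter replaces the hand-set `F`, `H`, `Q`, `R` with learned components):

```python
import numpy as np

def kf_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a linear Kalman filter."""
    # Predict: propagate state and covariance through the motion model
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: blend in the measurement via the Kalman gain
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Hypothetical 1D constant-velocity model, measuring position only
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q, R = 1e-4 * np.eye(2), np.array([[0.01]])
x, P = np.zeros(2), np.eye(2)
for k in range(50):
    z = np.array([0.1 * k * dt])   # noiseless track moving at 0.1 m/s
    x, P = kf_step(x, P, z, F, H, Q, R)
```

After 50 steps the velocity estimate converges toward the true 0.1 m/s even though only positions are measured, which is the state-estimation property the learned filter inherits.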
|
| |
| 15:30-17:00, Paper MoBIP-06.7 | Add to My Program |
| Closed Loop Control of Tendon Driven Continuum Robots Using IMUs |
|
| Srivastava, Manu | Clemson University |
| Groff, Richard | Clemson University |
| Walker, Ian | Clemson University |
Keywords: Modeling, Control, and Learning for Soft Robots, Sensor-based Control, Biomimetics
Abstract: In this paper, we present a new approach to the control of continuum robot sections using IMU quaternion feedback. We use a discrete time root finding algorithm to drive a continuum section in the desired shape space direction. We found that the approach lacks end effector positioning accuracy when used by itself, however, when used in conjunction with a feedforward model it actively counters the influence of unmodeled factors. The approach is implemented on a single section of a continuum hose robot developed for 3D printing of concrete in construction applications. The results demonstrate significant improvements in positioning accuracy compared to standalone kinematics/mechanics-based position control of tendon lengths. Additionally, this approach can be implemented using low cost sensing and control hardware.
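The discrete-time root-finding idea can be illustrated on a scalar toy problem: iterate tendon commands until a measured shape error crosses zero. The secant iteration below is only a sketch; the tendon-to-bend-angle map `2*atan(l)` is hypothetical, and the real controller drives an error derived from IMU quaternions rather than a scalar angle:

```python
import math

def secant_root(f, x0, x1, tol=1e-8, max_iter=50):
    """Secant iteration: find x with f(x) ~ 0 from two initial guesses."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if abs(f1) < tol:
            return x1
        # Secant step toward the zero crossing of the error
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
    return x1

# Hypothetical tendon-length-to-bend-angle map; drive the bend to 0.8 rad
target = 0.8
l = secant_root(lambda l: 2.0 * math.atan(l) - target, 0.0, 1.0)
```

In a hardware loop, each evaluation of `f` corresponds to commanding a tendon length and reading back the resulting orientation from the IMU.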
|
| |
| 15:30-17:00, Paper MoBIP-06.8 | Add to My Program |
| Machine Learning Best Practices for Soft Robot Proprioception |
|
| Zhang, Annan | Massachusetts Institute of Technology |
| Wang, Tsun-Hsuan | Massachusetts Institute of Technology |
| Truby, Ryan | Northwestern University |
| Chin, Lillian | Massachusetts Institute of Technology |
| Rus, Daniela | MIT |
Keywords: Modeling, Control, and Learning for Soft Robots, Performance Evaluation and Benchmarking, Soft Sensors and Actuators
Abstract: Machine learning-based approaches for soft robot proprioception have recently gained popularity, in part due to the difficulties in modeling the relationship between sensor signals and robot shape. However, to date, there exists no systematic analysis of the required design choices to set up a machine learning pipeline for soft robot proprioception. Here, we present the first study examining how design choices on different levels of the machine learning pipeline affect the performance of a neural network for predicting the state of a soft robot. We address the most frequent questions researchers face, such as how to choose the appropriate sensor and actuator signals, process input and output data, deal with time series, and pick the best neural network architecture. By testing our hypotheses on data collected from two vastly different systems, an electrically actuated robotic platform and a pneumatically actuated soft trunk, we seek conclusions that may generalize beyond one specific type of soft robot and hope to provide insights for researchers to use machine learning for soft robot proprioception.
|
| |
| 15:30-17:00, Paper MoBIP-06.9 | Add to My Program |
| Modeling and Analysis of Tendon-Driven Continuum Robots for Rod-Based Locking |
|
| Rao, Priyanka | University of Toronto |
| Pogue, Chloe | University of Toronto |
| Peyron, Quentin | Inria and CRIStAL UMR CNRS 9189, University of Lille |
| Diller, Eric D. | University of Toronto |
| Burgner-Kahrs, Jessica | University of Toronto |
Keywords: Modeling, Control, and Learning for Soft Robots, Flexible Robotics, Kinematics
Abstract: Various design modifications have been proposed for tendon-driven continuum robots to improve their stiffness and workspace. One of them is using locking mechanisms to constrain the lengths of rods or passive backbones along the backbone length. However, physics-based models used to predict these robots' behaviour commonly assume that the curvature of the locked portion does not change during robot actuation or that the effects of friction and gravity are negligible. In addition, these models do not consider the variation in twist on the application of force. In this letter, we propose a 3D static model for tendon-driven continuum robots experiencing locking due to length constraints on rods along their backbone. The proposed model is evaluated on prototypes of length 240 mm, with up to three locking mechanisms and has an accuracy of 3.63% w.r.t. length. Using the proposed model, a compliance analysis is performed studying the evolution of the robot compliance with the position of the locking mechanisms. An actuation strategy is proposed that can allow the robot to achieve the same shape with different compliance.
|
| |
| 15:30-17:00, Paper MoBIP-06.10 | Add to My Program |
| Path Planning Method with Constant Bending Angle Constraint for Soft Growing Robot Using Heat Welding Mechanism |
|
| Satake, Yuki | Waseda University |
| Ishii, Hiroyuki | Waseda University |
Keywords: Modeling, Control, and Learning for Soft Robots, Motion and Path Planning
Abstract: Soft growing robots, a new class of soft mobile robot, have recently attracted considerable interest. Among the many soft growing robots, some have irreversible growing and bending mechanisms. For such robots, path planning methods that provide information on bending timing improve operating efficiency. Although various path planning methods have been developed, they cannot be applied to our growing robot, which uses a heat welding mechanism and is therefore constrained so that all bending angles take a single constant value. This article proposes a novel path planning method with a constant bending angle constraint. The proposed algorithm was developed based on the rapidly-exploring random tree star (RRT*) algorithm and incorporates a step that removes unnecessary nodes from obtained paths while keeping bending angles constant, improving path optimality. We confirmed that the proposed method generates paths whose bending angles are constant. In addition, we experimented with moving our robot along the planned path in a field with obstacles. The results showed that the proposed method enabled the robot to reach the target place while avoiding obstacles, improving the operating efficiency of our soft growing robot.
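The constraint itself is simple to state in code. The sketch below only checks whether a candidate polyline path satisfies a constant-bend constraint (every interior turn equals ±theta); it is not the authors' RRT* variant, and the example paths are hypothetical:

```python
import math

def bend_angles(path):
    """Signed turning angle at each interior waypoint of a 2D polyline."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(path, path[1:], path[2:]):
        a1 = math.atan2(y1 - y0, x1 - x0)
        a2 = math.atan2(y2 - y1, x2 - x1)
        # Wrap the heading change into (-pi, pi]
        d = (a2 - a1 + math.pi) % (2 * math.pi) - math.pi
        angles.append(d)
    return angles

def satisfies_constant_bend(path, theta, tol=1e-9):
    """True if every bend along the path equals +/- theta."""
    return all(abs(abs(a) - theta) < tol for a in bend_angles(path))

# A path whose two bends are both 90 degrees (one left, one right)
path = [(0, 0), (1, 0), (1, 1), (2, 1)]
ok = satisfies_constant_bend(path, math.pi / 2)
```

Inside an RRT*-style planner, such a check (or its enforcement during node expansion and pruning) is what keeps every generated path weldable at the fixed angle.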
|
| |
| 15:30-17:00, Paper MoBIP-06.11 | Add to My Program |
| Static Shape Control of Soft Continuum Robots Using Deep Visual Inverse Kinematic Models (I) |
|
| Almanzor, Elijah | University of Cambridge |
| Ye, Fan | University of Cambridge |
| Shi, Jialei | University College London |
| George Thuruthel, Thomas | University College London |
| Wurdemann, Helge Arne | University College London |
| Iida, Fumiya | University of Cambridge |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Deep Learning in Robotics and Automation, Medical Robots and Systems
Abstract: Soft continuum robots are highly flexible and adaptable, making them ideal for unstructured environments such as the human body and agriculture. However, their high compliance and manoeuvrability make them difficult to model, sense, and control. Current control strategies focus on Cartesian space control of the end-effector, but few works have explored full-body control. This study presents a novel image-based deep learning approach for closed-loop kinematic shape control of soft continuum robots. The method combines a local inverse kinematics formulation in the image-space with deep convolutional neural networks for accurate shape control that is robust to feedback noise and mechanical changes in the continuum arm. The shape controller is fast and straightforward to implement; it takes only a few hours to generate training data, train the network, and deploy, requiring only a web camera for feedback. This method offers an intuitive and user-friendly way to control the robot's 3D shape and configuration through teleoperation using only 2D hand-drawn images of the desired target state without the need for further user instruction or consideration of the robot's kinematics.
|
| |
| 15:30-17:00, Paper MoBIP-06.12 | Add to My Program |
| Model Predictive Control Applied to Different Time-Scale Dynamics of Flexible Joint Robots |
|
| Iskandar, Maged | German Aerospace Center - DLR |
| van Ommeren, Christiaan | Technical University of Munich |
| Wu, Xuwei | German Aerospace Center (DLR) |
| Albu-Schäffer, Alin | DLR - German Aerospace Center |
| Dietrich, Alexander | German Aerospace Center (DLR) |
Keywords: Modeling, Control, and Learning for Soft Robots, Compliance and Impedance Control, Compliant Joints and Mechanisms
Abstract: Modern lightweight robots are constructed to be collaborative, which often results in a low structural stiffness compared to conventional rigid robots. The controller must therefore handle the oscillatory dynamics that arise mainly from the intrinsic joint elasticity. Singular perturbation theory makes it possible to decompose the flexible-joint dynamics into fast and slow subsystems. This model separation makes it possible to incorporate future knowledge of the joint-level dynamical behavior into the controller design using the Model Predictive Control (MPC) technique. In this study, different architectures that combine singular perturbation and MPC are considered. For singular perturbation, the parameters that influence the validity of using this technique to control a flexible-joint robot are investigated. Furthermore, limits on the input constraints for the future trajectory are considered within MPC. The position control performance and robustness against external forces of each architecture are validated experimentally on a flexible-joint robot. The experiments show superior performance in practice for the presented MPC framework, especially with respect to the actuator torque limits.
|
| |
| 15:30-17:00, Paper MoBIP-06.13 | Add to My Program |
| A Framework for Simulation of Magnetic Soft Robots Using the Material Point Method |
|
| Davy, Joshua | University of Leeds |
| Lloyd, Peter Robert | University of Leeds |
| Chandler, James Henry | University of Leeds |
| Valdastri, Pietro | University of Leeds |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Materials and Design, Simulation and Animation
Abstract: Simulation represents a key aspect in the development of robot systems. The ability to simulate the behavior of real-world robots provides an environment where robot designs can be developed and control systems optimized. Due to the use of external magnetic fields for actuation, magnetic soft robots can be wirelessly controlled and are easily miniaturized. However, the interplay between magnetic soft materials and external sources of magnetic fields presents significant modelling complexities due to the coupling between material elasticity and magnetic wrench (forces and torques). In this work, we present a simulation framework for magnetic soft robots using the Material Point Method (MPM), which integrates hyperelastic material models with the magnetic wrench induced under external fields. Compared to existing Finite Element Methods (FEM), the presented MPM-based framework inherently models self-collision between areas of the model and can capture the effect of forces in non-homogeneous magnetic fields. We demonstrate the ability of the MPM framework to model the influence of magnetic wrench on magnetic soft robots, capture the dynamic behavior of robots under time-varying magnetic fields, and provide an accurate representation of deformation when colliding with obstacles. We show the versatility of the MPM framework by comparing simulations to a range of real-world magnetic soft robot designs previously presented in the literature.
|
| |
| MoBIP-07 Regular session, Hall E |
Add to My Program |
| Micro and Nano Robotics |
|
| |
| |
| 15:30-17:00, Paper MoBIP-07.1 | Add to My Program |
| Design, Fabrication, and Characterization of a Helical Adaptive Multi-Material MicroRobot (HAMMR) |
|
| Tan, Liyuan | Purdue University |
| Cappelleri, David | Purdue University |
Keywords: Micro/Nano Robots, Medical Robots and Systems
Abstract: Adaptive locomotion is an advanced function of microrobots that can be achieved using smart materials. In this paper, a responsive hydrogel is utilized as a smart material to fabricate Helical Adaptive Multi-material MicroRobots (HAMMRs) with deformable tails that achieve adaptive locomotion. Moreover, a novel fabrication method is proposed to realize these helical microrobots with enhanced swimming performance by taking advantage of a strong magnetic head and a deformable tail. The deformations of different tail designs and of the fabricated microrobots are tested in different solvents. The swimming performance of the microrobots is investigated experimentally under a rotating magnetic field and verified with theoretical calculations. The HAMMRs show significant deformations upon stimulation and changes in swimming performance that agree with the scaled calculation results. Finally, the HAMMRs exhibit enhanced mobility, with the highest published translational velocity for an adaptive swimming microrobot of 8.1 body lengths per second.
|
| |
| 15:30-17:00, Paper MoBIP-07.2 | Add to My Program |
| Active Capsule System for Multiple Therapeutic Patch Delivery: Preclinical Evaluation |
|
| Lee, Jihun | Daegu Gyeongbuk Institute of Science and Technology |
| Hoang, Manh Cuong | Chonnam National University |
| Kim, Jayoung | Korea Institute of Medical Microrobotics |
| Choe, Eunho | Korea Institute of Medical Microrobotics |
| Kee, Hyeonwoo | DGIST |
| Yang, Seungun | DGIST |
| Park, Jongoh | Chonnam National University |
| Park, Sukho | DGIST |
Keywords: Micro/Nano Robots, Medical Robots and Systems, Automation at Micro-Nano Scales
Abstract: Recently, active research has been conducted on the therapeutic functions of capsule endoscopes. Here, we propose an active capsule system that captures images of the interior of the gastrointestinal (GI) tract and actively delivers therapeutic patches. The active capsule system mainly comprises therapeutic patches, an active capsule equipped with a camera, and a robot-assisted magnetic actuator. The active capsule moves inside the GI tract via the robot-assisted magnetic actuator, captures pictures of the GI tract in real time, and performs hemostatic treatment by delivering therapeutic patches to the target lesions. First, the fundamental performance of the active capsule system was verified via a hemostatic performance test of the therapeutic patch, a patch contamination prevention test of the active capsule, and a basic actuation test of the capsule. Second, multiple therapeutic patches were delivered to the gastric surface in an ex vivo test using the active capsule system. Finally, as a preclinical test, an animal test using a porcine model confirmed that GI tract examination and therapeutic patch delivery are possible with the active capsule system. Consequently, the proposed active capsule system represents a new paradigm for capsule endoscopy with multiple therapeutic patch delivery capabilities.
|
| |
| 15:30-17:00, Paper MoBIP-07.3 | Add to My Program |
| Parallel Cell Array Patterning and Target Cell Lysis on an Optoelectronic Micro-Well Device |
|
| Gan, Chunyuan | Beihang University |
| Xiong, Hongyi | Beihang University |
| Zhao, Jiawei | Beihang University, School of Mechanical Engineering and Automati |
| Wang, Ao | BUAA |
| Wang, Chutian | Beihang University |
| Liang, Shuzhang | Beihang University |
| Zhang, Jiaying | Beihang University, School of Mechanical Engineering &Automation |
| Feng, Lin | Beihang University |
Keywords: Biological Cell Manipulation, Automation at Micro-Nano Scales, Micro/Nano Robots
Abstract: This work presents a novel electrical method, implemented in the form of a microfluidic device, for cell arraying and target cell lysis. The microfluidic device contains a micro-well array on a photoconductive layer based on the optoelectronic tweezers (OET) method, where parallel cell manipulation is performed. As the cell suspension flows over the micro-wells, cells can be actively captured in the micro-wells by light-induced dielectrophoresis (DEP) forces, forming the designed pattern array in less than 120 s. The single-cell capture rate is over 83% in the patterned cell array, and about 94% of micro-wells are occupied by cells. Then, the target cell in a specific micro-well is illuminated and lysed by electroporation within 5 seconds. The micro-well barriers and DEP forces block the influence of the flow, and the relatively closed space is critical for preserving the cell lysates. Through experiments, we show that light-induced DEP cell capture and target cell electroporation can be modulated by changing the light patterns and the applied signal. This device, based on OET and dynamic electroporation, enables rapid cell capture and target lysis at the single-cell level and can support single-cell-based studies, such as molecular diagnostics and disease detection.
|
| |
| 15:30-17:00, Paper MoBIP-07.4 | Add to My Program |
| Microrobot Control Method Based on Movement of Field Free Point in Gradient Magnetic Field |
|
| Wang, Chutian | Beihang University |
| Ji, Yiming | Beihang University |
| Luo, Xinyun | Beihang University |
| Gan, Chunyuan | Beihang University |
| Wang, Ao | BUAA |
| Zhao, Jiawei | Beihang University, School of Mechanical Engineering and Automati |
| Wang, Luyao | Beihang University |
| Feng, Lin | Beihang University |
Keywords: Micro/Nano Robots, Force Control, Motion Control
Abstract: Untethered microrobots driven by external physical fields show promise for minimally invasive disease treatments. One common driving field is the gradient magnetic field, which can provide microrobots with an adequate driving force in complicated environments. In this study, a control method for microrobots using a gradient magnetic field system is presented, realized by moving the field free point (FFP) to produce an adjustable magnetic driving force. A confirmatory experiment of reciprocating robot motion control is undertaken in a 1D gradient magnetic robot system. The control method could be applied in further studies on in vivo applications, such as targeted microrobot drug delivery systems.
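Why moving the FFP yields an adjustable force can be seen in a toy 1D sketch. It assumes a linear field profile B(x) = G · (x − x_ffp) and a magnetically saturated robot whose moment aligns with the local field, so the force follows the field-magnitude gradient, F = m · d|B|/dx; the values of `m` and `G` are hypothetical, and the paper's field and robot model may differ:

```python
import numpy as np

def driving_force(x, x_ffp, m=1e-3, G=2.0):
    """1D force on a saturated magnetic robot in a linear gradient field.

    With |B(x)| = G * |x - x_ffp| and a moment aligned with the field,
    F = m * G * sign(x - x_ffp): the force flips sign as the FFP passes
    the robot, which is what enables commanded reciprocating motion."""
    return m * G * np.sign(x - x_ffp)

# Robot at x = 0: an FFP to its left pushes it right, and vice versa,
# so oscillating the FFP position produces a reciprocating force.
f_right = driving_force(0.0, -0.01)
f_left = driving_force(0.0, +0.01)
```

Steering then amounts to scheduling the FFP trajectory so the resulting force sequence drives the robot along the desired path.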
|
| |
| 15:30-17:00, Paper MoBIP-07.5 | Add to My Program |
| Helical Propulsion in Low-Re Numbers with Near-Zero Angle of Attack |
|
| Ligtenberg, Leendert-Jan Wouter | University of Twente |
| Ekkelkamp, Ilse Alena Antonia | University of Twente |
| Halfwerk, Frank | University of Twente |
| Goulas, Constantinos | University of Twente |
| Arens, Jutta | University of Twente |
| Warle, Michiel | Radboud University Medical Center |
| Khalil, Islam S.M. | University of Twente |
Keywords: Micro/Nano Robots, Medical Robots and Systems
Abstract: One approach to the wireless actuation and gravity compensation of untethered helical magnetic devices (UHMDs) is swimming with a non-zero angle of attack (AoA). This configuration allows us to counteract gravity, so that for a given desired path we can move the UHMD controllably without it drifting downward under its own weight. This study investigates the use of a reduced-order version of the complex 6-degrees-of-freedom model of UHMDs in the low-Reynolds-number regime. A one-dimensional model representing the relative position of the UHMD with respect to a rotating permanent magnet actuator is used to predict a gap that yields bounded behavior of the open-loop system. Using a geometric representation of the reduced-order model, the local bounded behavior of the UHMD with near-zero AoA is attributed to periodic active magnetic suspension, which dominates near zero AoA. Our numerical results are verified experimentally, and the bounded behavior of the UHMD demonstrates the capability to swim with near-zero AoA (6.3° ± 2.2°) without drifting downward. With this actuation strategy, the orientation of the UHMD is unlikely to be needed during noninvasive localization, making the control system dependent only on its position with respect to a prescribed trajectory. This strategy also provides a computational advantage in adjusting the gap between the UHMD and a robotically controlled rotating permanent magnet actuator.
|
| |
| 15:30-17:00, Paper MoBIP-07.6 | Add to My Program |
| Influence of Nanoparticle Coating on the Differential Magnetometry and Wireless Actuation of Biohybrid Microrobots |
|
| Magdanz, Veronika | University of Waterloo |
| Cumming, Jack | University of Twente |
| Salamzadeh, Sadaf | University of Twente |
| Tesselaar, Sven | University of Twente |
| Lejla, Alic | University of Twente |
| Abelmann, Leon | University of Twente |
| Khalil, Islam S.M. | University of Twente |
Keywords: Micro/Nano Robots
Abstract: Magnetic nanoparticles can be electrostatically assembled around sperm cells to form biohybrid microrobots. These biohybrid microrobots possess sufficient magnetic material to potentially allow for pulse-echo localization and wireless actuation. Alternatively, magnetic excitation of these nanoparticles can be used for localization based on Faraday's law of induction using a detection coil. Here, we investigate the influence of the electrostatic attraction between positively charged nanoparticles and negatively charged sperm cells on the activation of the nanoparticles during nonlinear differential magnetometry and wireless magnetic actuation. Activation of clusters of free nanoparticles and nanoparticles bound to the body of sperm cells is achieved by a combination of a high-frequency alternating field and a pulsating static field. The nonlinear response in both cases indicates that constraining the nanoparticles is likely to yield significant decreases in magnetometry sensitivity. While the attachment of particles to the cells enables wireless actuation (rolling locomotion), the rate of change of the magnetization of the nanoparticles decreases by one order of magnitude compared to free nanoparticles.
|
| |
| 15:30-17:00, Paper MoBIP-07.7 | Add to My Program |
| Using Piezoceramic-Actuated Stages in Precision Long-Stroke Motion Systems: A Design Procedure |
|
| Al-Rawashdeh, Yazan | Memorial University of Newfoundland |
| Al Saaideh, Mohammad | Memorial University of Newfoundland |
| Al Janaideh, Mohammad | University of Guelph |
Keywords: Automation at Micro-Nano Scales
Abstract: We consider the integration of fine-positioning piezo-actuated stages into precision motion systems, which results in multi-stage configurations. In such configurations, the fine stages are typically attached by mechanical means to coarse positioning stages that do not meet the required precision. Once the motion is synchronized, the fine stages enhance the overall precision of the multi-stage system. Undesirably, mechanical and electromagnetic interference between the stages takes place, which may limit the attainable precision. To control the fine stages, we propose feedforward control based on the inverse Prandtl-Ishlinskii model to accommodate the dynamic behavior and hysteresis of the piezoceramics. Targeting semiconductor manufacturing, we outline the multi-stage design steps required by the proposed approach. We also assess the performance of a representative precision motion system comprising a planar coarse stage and a uni-axial fine stage under step-and-scan trajectories. The results show that the proposed piezo-actuated fine stage improves the scanning accuracy of the overall motion system.
|
| |
| 15:30-17:00, Paper MoBIP-07.8 | Add to My Program |
| Buoyancy Enabled Non-Inertial Dynamic Walking |
|
| Yim, Mark | University of Pennsylvania |
| Gosrich, Walker | University of Pennsylvania |
| Miskin, Marc | University of Pennsylvania |
Keywords: Micro/Nano Robots, Legged Robots
Abstract: We propose a mechanism for low-Reynolds-number walking (e.g., legged microscale robots). Whereas locomotion for legged robots has traditionally been classified as dynamic (where inertia plays a role) or static (where the system is always statically stable), we introduce a new locomotion modality we call buoyancy-enabled non-inertial dynamic walking, in which inertia plays no role yet the robot is not statically stable. Instead, falling and viscous drag play critical roles. This model assumes squeeze-flow forces from fluid interactions combined with a well-timed gait as the mechanism by which forward motion can be achieved by a reciprocating legged robot. Using two physical demonstrations of robots with Reynolds numbers ranging from 0.0001 to 0.02 (a microscale robot in water and a centimeter-scale robot in glycerol), we find the model qualitatively describes the motion. This model can help in understanding microscale locomotion and designing new microscale walking robots, including controlling forward and backward motion and potentially steering these robots.
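The Reynolds-number range quoted in the abstract can be sanity-checked with the standard definition Re = ρvL/μ. A minimal sketch, where the body sizes and speeds below are invented for illustration and only the fluid properties are standard values:

```python
def reynolds(rho, v, length, mu):
    """Dimensionless Reynolds number: density * speed * length / viscosity."""
    return rho * v * length / mu

# Microscale robot in water (assumed 100 um body moving at 100 um/s).
re_micro = reynolds(rho=1000.0, v=100e-6, length=100e-6, mu=1.0e-3)

# Centimeter-scale robot in glycerol (assumed 2 cm body at 1 mm/s).
re_macro = reynolds(rho=1260.0, v=0.001, length=0.02, mu=1.4)
```

Both assumed configurations land inside the 0.0001 to 0.02 range reported in the abstract, i.e. well below the regime where inertia matters.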
|
| |
| 15:30-17:00, Paper MoBIP-07.9 | Add to My Program |
| Ultrafast Acoustic Holography with Physics-Reinforced Self-Supervised Learning for Precise Robotic Manipulation |
|
| Lu, Qingyi | Shanghaitech University |
| Zhong, Chengxi | ShanghaiTech University |
| Liu, Qing | Shanghaitech University |
| Li, Teng | Tsinghua University |
| Su, Hu | Institute of Automation, Chinese Academy of Science |
| Liu, Song | ShanghaiTech University |
Keywords: Micro/Nano Robots, Dexterous Manipulation, Deep Learning Methods
Abstract: Ultrafast acoustic holography (AH), which enables dynamic contactless micro/nano robotic manipulation, has recently attracted wide attention. As an advanced technique, AH encodes a specific three-dimensional (3D) acoustic field onto a two-dimensional (2D) hologram, thereby realizing holographic reconstruction with high fidelity. However, current approaches are limited in encoding time, accuracy, and flexibility, making them inapplicable to dynamic and precise robotic manipulation. Here, we develop an approach to overcome these issues. Its basic idea is to use a convolutional neural network trained in a self-supervised manner through iterative interaction with a virtual physical environment. Energy conservation is incorporated to enforce the physical constraint during wave propagation. The experimental results demonstrate that the proposed method circumvents laborious annotated-dataset preparation and benefits from reinforcement by the physics model. Through validation and comparison on distinct acoustic fields with various patterns, the accuracy and real-time performance of the proposed method are confirmed, supporting dynamic and precise robotic manipulation.
|
| |
| 15:30-17:00, Paper MoBIP-07.10 | Add to My Program |
| Surface Navigation of Alginate Artificial Cells in Mucus Solutions |
|
| Rogowski, Louis | Applied Research Associates |
| Wood, Justin | Applied Research Associates |
| Cooke, Tobias | Applied Research Associates |
| Kararsiz, Gokhan | Southern Methodist University |
| Kim, MinJun | Southern Methodist University |
Keywords: Micro/Nano Robots, Medical Robots and Systems, Soft Robot Applications
Abstract: Alginate hydrogels are widely researched in pharmaceutical applications for their abilities to encapsulate and disperse therapeutics in response to stimuli. While effective, their utility can be greatly improved once converted into artificial cell soft-microrobots, allowing them to actively navigate through complex in vivo environments and facilitate targeted drug delivery. In this study, artificial cells were fabricated by crosslinking alginate with magnetic nanoparticles and then deployed within mucus solutions to characterize their propulsion capabilities. The goal of this study was to understand how variations in simplified gastrointestinal fluid, artificial cell properties, and magnetic field characteristics could affect surface locomotion. A comparison between automatic feedback control and manual 'open-loop' operation was also quantitatively explored. Under feedback control, individual artificial cells were navigated with automatically generated waypoints and a PID controller. Simulations were used to verify controller performance and accuracy. User operation was carried out using an Xbox controller, where the joystick could directly change navigation direction. We conclude in this study that the surface navigation of artificial cells is highly predictable within mucus concentrations and that both feedback and open-loop control are equally successful in navigation.
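The waypoint-plus-PID scheme described above can be illustrated with a minimal sketch. The point-mass plant, gains, and 1-D setting are assumptions for illustration, not the paper's controller:

```python
def pid_step(error, prev_error, integral, kp, ki, kd, dt):
    """One PID update; returns (control output, updated integral)."""
    integral += error * dt
    derivative = (error - prev_error) / dt
    return kp * error + ki * integral + kd * derivative, integral

def navigate(pos, waypoint, kp=2.0, ki=0.1, kd=0.5, dt=0.05, tol=0.01):
    """Drive a 1-D point toward a waypoint; returns (final position, steps)."""
    integral, prev_error, steps = 0.0, waypoint - pos, 0
    while abs(waypoint - pos) > tol and steps < 10_000:
        error = waypoint - pos
        u, integral = pid_step(error, prev_error, integral, kp, ki, kd, dt)
        pos += u * dt  # velocity-command plant model (an assumption)
        prev_error = error
        steps += 1
    return pos, steps

final, n = navigate(0.0, 1.0)
```

In the paper's setting the "plant" is the magnetically actuated cell and the error is measured from tracked positions to the next auto-generated waypoint; the loop structure is the same.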
|
| |
| 15:30-17:00, Paper MoBIP-07.11 | Add to My Program |
| Design and Control of Microscale Dual Locomotion Mode Multi-Functional Robots (μDMMFs) |
|
| Davis, Aaron C. | Purdue University |
| Cappelleri, David | Purdue University |
Keywords: Micro/Nano Robots
Abstract: This paper presents the design and control of a novel microrobot that utilizes two distinct magnetic locomotion methods, a combination of rotating and gradient field control, for precise micro-object manipulation using multiple end-effectors. Rotating magnetic fields induce a tumbling locomotion mode to increase the movement speed and decrease issues associated with stiction and locomotion over rough surfaces. The gradient field control allows for precise manipulation using the end-effectors, which include a pointed tip for splitting groups of objects and a blunt end for pushing or capturing objects. The microrobot is fabricated using a two-photon polymerization 3D printer, allowing for the precise reproduction of complex geometries and designs. The potential applications of this technology in the medical field are discussed, highlighting the potential for in vitro cellular manipulation.
|
| |
| 15:30-17:00, Paper MoBIP-07.12 | Add to My Program |
| A New 1-Mg Fast Unimorph SMA-Based Actuator for Microrobotics |
|
| Trygstad, Conor | Washington State University |
| Nguyen, Xuan-Truc | University of Southern California |
| Perez-Arancibia, Nestor O | Washington State University (WSU) |
Keywords: Micro/Nano Robots, Biologically-Inspired Robots, Methods and Tools for Robot System Design
Abstract: We present a new unimorph actuator for microrobotics, which is driven by thin shape-memory alloy (SMA) wires. Using a passive-capillary-alignment technique and existing SMA-microsystem fabrication methods, we developed an actuator that is 7 mm long, has a volume of 0.45 mm³, weighs 0.96 mg, and can achieve operation frequencies of up to 40 Hz as well as lift 155 times its own weight. To demonstrate the capabilities of the proposed actuator, we created an 8-mg crawler, the MiniBug, and a bioinspired 56-mg controllable water-surface-tension crawler, the WaterStrider. The MiniBug is 8.5 mm long, can locomote at speeds as high as 0.76 BL/s (body lengths per second), and is the lightest fully-functional crawling microrobot of its type ever created. The WaterStrider is 22 mm long, and can locomote at speeds of up to 0.28 BL/s as well as execute turning maneuvers at angular rates on the order of 0.144 rad/s. The WaterStrider is the lightest controllable SMA-driven water-surface-tension crawler developed to date.
|
| |
| 15:30-17:00, Paper MoBIP-07.13 | Add to My Program |
| Toward Sub-Gram Helicopters: Designing a Miniaturized Flybar for Passive Stability |
|
| Johnson, Kyle | University of Washington Paul G. Allen School for Computer Scien |
| Arroyos, Vicente | University of Washington |
| Villanueva, Raul | University of Washington |
| Schulz, Adriana | MIT |
| Fuller, Sawyer | University of Washington |
| Iyer, Vikram | University of Washington |
Keywords: Micro/Nano Robots, Mechanism Design, Aerial Systems: Mechanics and Control
Abstract: Sub-gram flying robots have transformative potential in applications from search and rescue to precision agriculture to environmental monitoring. However, a key gap in achieving autonomous flight for these applications is the low lift-to-weight ratio of flapping-wing and quadrotor designs at around 1 g or less. To close this gap, we propose a helicopter-style design that minimizes size and weight by leveraging the high lift, reliability, and low voltage of sub-gram motors. We take an important step toward this goal by designing a lightweight, microfabricated flybar mechanism to passively stabilize such a robot. Our 48 mg flybar is folded from a flat carbon fiber laminate into a 3D mechanism that couples tilting of the flybar to a change in the angle of attack of the rotors. Our design uses flexure joints instead of the ball-in-socket joints common in larger flybars. To expedite the design exploration and optimization of a microfabricated flat-folded flybar, we develop a novel user-in-the-loop bi-level optimization workflow that combines Bayesian optimization design tools and expert feedback. We develop four template designs and use this method to achieve a peak damping ratio of 0.528, an 18.9x improvement over our initial design. Compared to a flybar-less rotor with a near-zero damping ratio, our flybar-rotor mechanism maintains stable roll and pitch with relative deviations <1°. Our results show that, if combined with a counter-torque mechanism such as a tail rotor, our miniaturized flybar could mechanically provide attitude stability for a sub-gram helicopter.
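Damping ratios like those quoted above are commonly estimated from a decaying oscillation via the logarithmic decrement. This standard formula is shown as context only; the peak amplitudes below are made up and the paper's measurement procedure may differ:

```python
import math

def damping_ratio(peak1, peak2):
    """Damping ratio from two successive peaks of a decaying oscillation,
    using the logarithmic decrement delta = ln(peak1 / peak2)."""
    delta = math.log(peak1 / peak2)
    return delta / math.sqrt(4.0 * math.pi ** 2 + delta ** 2)

# An oscillation whose amplitude halves each cycle (illustrative numbers).
zeta = damping_ratio(1.0, 0.5)
```

An undamped response (equal successive peaks) gives zero, and faster amplitude decay gives a larger ratio, which is the sense in which the flybar's 0.528 improves on a near-zero baseline.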
|
| |
| 15:30-17:00, Paper MoBIP-07.14 | Add to My Program |
| Manipulation of Optical Force-Induced Micro-Assemblies at the Air-Liquid Interface |
|
| Carlisle, Nicholas | Massey University |
| Williams, Martin | Massey University |
| Whitby, Catherine | Massey University |
| Nock, Volker | University of Canterbury |
| Chen, Jack L Y | AUT |
| Avci, Ebubekir | Massey University |
Keywords: Micro/Nano Robots, Automation at Micro-Nano Scales, Swarm Robotics
Abstract: Colloidal particles trapped by a focused laser at the air-liquid interface exhibit interesting assembly dynamics. In this study, we demonstrate manipulating optical force-induced swarms via dynamic locomotion of assemblies built with holographic optical tweezers. This manipulation approach lays the foundation for autonomous control of building assemblies at the air-liquid interface, the first time optical micro-robots have performed this feat. Our proposed semi-autonomous control allows users to produce small dynamic secondary assemblies at the interface, which are transported to and merged with a main static assembly. This static-dynamic approach grows assemblies up to ∼2.1 times larger than conventional methods. Manipulating and controlling large-scale optical force-induced assemblies in real time to create re-configurable swarms has the potential to lead to new technology and approaches for complex tasks, such as the development of new materials, transportation of biological matter, studying biofilm formation by bacterial colonies at the air-liquid interface, and more.
|
| |
| MoBIP-08 Regular session, Hall E |
Add to My Program |
| Legged Robots II |
|
| |
| |
| 15:30-17:00, Paper MoBIP-08.1 | Add to My Program |
| Creating a Dynamic Quadrupedal Robotic Goalkeeper with Reinforcement Learning |
|
| Huang, Xiaoyu | Georgia Institute of Technology |
| Li, Zhongyu | University of California, Berkeley |
| Xiang, Yanzhen | ETH Zurich |
| Ni, Yiming | University of California Berkeley |
| Chi, Yufeng | University of California, Berkeley |
| Li, Yunhao | University of California, Berkeley |
| Yang, Lizhi | California Institute of Technology |
| Peng, Xue Bin | Simon Fraser University |
| Sreenath, Koushil | University of California, Berkeley |
Keywords: Legged Robots, Reinforcement Learning, Whole-Body Motion Planning and Control
Abstract: We present a reinforcement learning (RL) framework that enables quadrupedal robots to perform soccer goalkeeping tasks in the real world. Soccer goalkeeping with quadrupeds is a challenging problem that combines highly dynamic locomotion with precise and fast non-prehensile object (ball) manipulation. The robot needs to react to and intercept a potentially flying ball using dynamic locomotion maneuvers in a very short amount of time, usually less than one second. In this paper, we propose to address this problem using a hierarchical model-free RL framework. The first component of the framework contains multiple control policies for distinct locomotion skills, which can be used to cover different regions of the goal. Each control policy enables the robot to track random parametric end-effector trajectories while performing one specific locomotion skill, such as jump, dive, and sidestep. These skills are then utilized by the second part of the framework, a high-level planner that determines a desired skill and end-effector trajectory in order to intercept a ball flying to different regions of the goal. We deploy the proposed framework on a Mini Cheetah quadrupedal robot and demonstrate the effectiveness of our framework for various agile interceptions of a fast-moving ball in the real world.
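The high-level planner described above can be caricatured as mapping a predicted interception point to one of the pretrained skills. A schematic sketch; the region boundaries and skill names beyond those listed in the abstract are invented for illustration:

```python
def select_skill(intercept_y, intercept_z, goal_half_width=1.0):
    """Map a predicted interception point (lateral offset y, height z)
    to one of the pretrained locomotion skills. Thresholds are assumed."""
    if intercept_z > 0.4:                          # high ball -> jump
        return "jump"
    if abs(intercept_y) > 0.6 * goal_half_width:   # far to the side -> dive
        return "dive"
    return "sidestep"                              # near the center

skill = select_skill(0.8, 0.2)
```

In the paper the planner is itself learned and also outputs an end-effector trajectory for the chosen skill; this sketch only conveys the region-to-skill decomposition.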
|
| |
| 15:30-17:00, Paper MoBIP-08.2 | Add to My Program |
| Walking in Narrow Spaces: Safety-Critical Locomotion Control for Quadrupedal Robots with Duality-Based Optimization |
|
| Liao, Qiayuan | University of California, Berkeley |
| Li, Zhongyu | University of California, Berkeley |
| Thirugnanam, Akshay | University of California, Berkeley |
| Zeng, Jun | University of California, Berkeley |
| Sreenath, Koushil | University of California, Berkeley |
Keywords: Legged Robots, Collision Avoidance, Optimization and Optimal Control
Abstract: This paper presents a safety-critical locomotion control framework for quadrupedal robots. Our goal is to enable quadrupedal robots to safely navigate in cluttered environments. To tackle this, we introduce exponential Discrete Control Barrier Functions (exponential DCBFs) with duality-based obstacle avoidance constraints into a Nonlinear Model Predictive Control (NMPC) with Whole-Body Control (WBC) framework for quadrupedal locomotion control. This enables us to use polytopes to describe the shapes of the robot and obstacles for collision avoidance while doing locomotion control of quadrupedal robots. Compared to most prior work, especially CBF-based methods that use conservative spherical approximations for obstacle avoidance, this work demonstrates a quadrupedal robot autonomously and safely navigating through very tight spaces in the real world.
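For readers unfamiliar with the constraint family named above, a discrete control barrier function h ≥ 0 encoding separation from an obstacle is commonly enforced at each predictive-control step as (generic textbook form; the paper's duality-based polytope constraints are more elaborate):

```latex
h(x_{k+1}) \ge (1 - \gamma)\, h(x_k), \qquad 0 < \gamma \le 1,
```

which implies h(x_k) ≥ (1 − γ)^k h(x_0) ≥ 0, so the safe set {x : h(x) ≥ 0} remains forward invariant while h may decay at most geometrically toward the boundary.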
|
| |
| 15:30-17:00, Paper MoBIP-08.3 | Add to My Program |
| ARMP: Autoregressive Motion Planning for Quadruped Locomotion and Navigation in Complex Indoor Environments |
|
| Kim, Jeonghwan | Georgia Institute of Technology |
| Li, Tianyu | Facebook |
| Ha, Sehoon | Georgia Institute of Technology |
Keywords: Legged Robots, Task and Motion Planning, Simulation and Animation
Abstract: Generating natural and physically feasible motions for legged robots has been a challenging problem due to their complex dynamics. In this work, we introduce a novel learning-based framework, the autoregressive motion planner (ARMP), for quadruped locomotion and navigation. Our method can generate motion plans of arbitrary length in an autoregressive fashion, unlike most offline trajectory optimization algorithms, which assume a fixed trajectory length. To this end, we first construct a motion library by solving a dense set of trajectory optimization problems for diverse scenarios and parameter settings. Then we learn the motion manifold from the dataset in a supervised learning fashion. We show that the proposed ARMP can generate physically plausible motions for various tasks and situations. We also showcase that our method can be successfully integrated with recent robot navigation frameworks as a low-level controller, unleashing the full capability of legged robots for complex indoor navigation.
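The autoregressive idea above (arbitrary-length plans by feeding the planner's last output back in) can be sketched minimally. The one-step "model" here is a stand-in (a damped step toward a goal), not the learned motion manifold:

```python
def one_step_model(state, goal, alpha=0.2):
    """Placeholder for the learned next-state predictor (an assumption)."""
    return tuple(s + alpha * (g - s) for s, g in zip(state, goal))

def rollout(start, goal, horizon):
    """Autoregressive plan of arbitrary length `horizon`: each output
    state becomes the input for the next prediction."""
    plan, state = [start], start
    for _ in range(horizon):
        state = one_step_model(state, goal)
        plan.append(state)
    return plan

plan = rollout((0.0, 0.0), (1.0, 2.0), horizon=50)
```

Because the loop can run for any number of steps, the plan length is chosen at query time rather than fixed when the planner is built, which is the contrast with fixed-horizon trajectory optimization drawn in the abstract.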
|
| |
| 15:30-17:00, Paper MoBIP-08.4 | Add to My Program |
| Perceptive Hexapod Legged Locomotion for Climbing Joist Environments |
|
| Zang, Zixian | University of California, Berkeley |
| Kawawa-Beaudan, Maxime | J.P. Morgan AI Research |
| Yu, Wenhao | Google |
| Zhang, Tingnan | Google |
| Zakhor, Avideh | University of California, Berkeley |
Keywords: Legged Robots, Reinforcement Learning
Abstract: Attics are one of the largest sources of energy loss in residential homes, but they are uncomfortable and dangerous places for human workers to conduct air sealing and insulation. Hexapod robots are potentially suitable for carrying out those tasks in tight attic spaces since they are stable, compact, and lightweight. For hexapods to succeed in these tasks, they must be able to navigate inside the tight attic spaces of single-family residential homes in the U.S., which typically contain rows of approximately 6- or 8-inch-tall joists placed 16 inches apart. Climbing over such obstacles is challenging for autonomous robotic systems. In this work, we develop a perceptive walking model for legged hexapods that can traverse terrain with random joist structures using egocentric vision. Our method can be used on low-cost hardware that does not provide real-time joint state feedback. We train our model in a teacher-student fashion with two phases: in phase 1, we use reinforcement learning with access to privileged information such as local elevation maps and joint feedback; in phase 2, we use supervised learning to distill the model into one with access to only onboard observations, consisting of egocentric depth images and robot orientation captured by a tracking camera. We demonstrate zero-shot sim-to-real transfer on a SpiderPi robot, equipped with an onboard depth camera, climbing over joist courses we construct to simulate the environment in the field. Our proposed method achieves a nearly 100% success rate climbing over the test courses, significantly outperforming both the model without perception and the controller provided by the manufacturer.
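The phase-2 distillation step above can be illustrated in miniature: a "student" is regressed onto a privileged "teacher" policy's actions using only the observations available at deployment. The linear policies and synthetic data are illustrative assumptions, not the paper's networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical teacher: maps a 4-D privileged state to a 2-D action.
W_teacher = rng.normal(size=(2, 4))
states = rng.normal(size=(256, 4))
teacher_actions = states @ W_teacher.T

# The student only sees the first 3 dims (stand-in for onboard observations).
student_obs = states[:, :3]

# Fit the student by least squares (a stand-in for SGD on an MSE loss).
W_student, *_ = np.linalg.lstsq(student_obs, teacher_actions, rcond=None)

mse = float(np.mean((student_obs @ W_student - teacher_actions) ** 2))
```

The residual error reflects what the student cannot recover from its reduced observation, which is exactly the gap the depth images and orientation inputs must close in the real system.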
|
| |
| 15:30-17:00, Paper MoBIP-08.5 | Add to My Program |
| Design of STARQ: A Multimodal Quadrupedal Robot for Running, Climbing, and Swimming |
|
| Vasquez, Derek A. | Florida State University |
| Jay, David | FAMU-FSU College of Engineering |
| Dina, Michael | Florida State University |
| Austin, Max | Florida State University |
| McConomy, Shayne | FAMU - FSU College of Engineering |
| Clark, Jonathan | Florida State University |
Keywords: Legged Robots, Climbing Robots, Biologically-Inspired Robots
Abstract: Legged animals have developed a variety of modes of locomotion to adapt to the diverse and unknown terrain challenges posed in the natural world. Legged robots, however, have been largely limited to specializing in one domain, with few that have endeavored to bridge the gap between two. In this work we present the Scansorial, Terrestrial, and Aquatic Robot Quadruped (STARQ), a novel legged robot capable of bridging three different domains with three modes of locomotion: walking, climbing, and swimming. In this study we describe model-based design techniques as well as design innovations that have made multimodal locomotion possible including waterproof hips for 2-DOF high torque legs, legs capable of effective power transmission in three modes, and bi-directionally compliant feet for walking and attaching to vertical surfaces. To demonstrate the robot's capabilities we present locomotion test data including speed and cost of transport in each of these domains. We also demonstrate the capability to transition from walking to swimming in a natural environment.
|
| |
| 15:30-17:00, Paper MoBIP-08.6 | Add to My Program |
| Hierarchical Adaptive Control for Collaborative Manipulation of a Rigid Object by Quadrupedal Robots |
|
| Sombolestan, Mohsen | University of Southern California |
| Nguyen, Quan | University of Southern California |
Keywords: Legged Robots, Robust/Adaptive Control, Mobile Manipulation
Abstract: Despite the potential benefits of collaborative robots, effective manipulation tasks with quadruped robots remain difficult to realize. In this paper, we propose a hierarchical control system that can handle real-world collaborative manipulation tasks, including uncertainties arising from object properties, shape, and terrain. Our approach consists of three levels of controllers. Firstly, an adaptive controller computes the required force and moment for object manipulation without prior knowledge of the object's properties and terrain. The computed force and moment are then optimally distributed between the team of quadruped robots using a Quadratic Programming (QP)-based controller. This QP-based controller optimizes each robot's contact point location with the object while satisfying constraints associated with robot-object contact. Finally, a decentralized loco-manipulation controller is designed for each robot to apply manipulation force while maintaining the robot's stability. We successfully validated our approach in a high-fidelity simulation environment where a team of quadruped robots manipulated an unknown object weighing up to 18 kg on different terrains while following the desired trajectory.
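The middle layer described above distributes a desired net force and moment among the robots' contact points. A simplified sketch: the QP is reduced here to an equality-constrained least-norm problem solved with a pseudoinverse, and the planar two-robot contact geometry is a toy assumption, not the paper's setup:

```python
import numpy as np

# Two robots push a planar object at contact points r1, r2 (meters).
r = np.array([[0.5, 0.0], [-0.5, 0.0]])

# Grasp map G sends stacked contact forces (fx1, fy1, fx2, fy2) to the
# net wrench (Fx, Fy, Mz), with Mz_i = rx_i * fy_i - ry_i * fx_i.
G = np.array([
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [-r[0, 1], r[0, 0], -r[1, 1], r[1, 0]],
])

wrench_des = np.array([10.0, 0.0, 1.0])  # desired (Fx, Fy, Mz), assumed

# Minimum-norm distribution satisfying G f = w (stand-in for the QP,
# which additionally optimizes contact locations and honors constraints).
f = np.linalg.pinv(G) @ wrench_des
```

Each robot then realizes its assigned contact force through the decentralized loco-manipulation controller while keeping its own balance.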
|
| |
| 15:30-17:00, Paper MoBIP-08.7 | Add to My Program |
| Proprioception and Reaction for Walking among Entanglements |
|
| Yim, Justin K. | University of Illinois Urbana-Champaign |
| Ren, Jiming | Carnegie Mellon University |
| Ologan, David | Carnegie Mellon University |
| Garcia Gonzalez, Selvin Orlando | Carnegie Mellon University |
| Johnson, Aaron M. | Carnegie Mellon University |
Keywords: Legged Robots, Force and Tactile Sensing
Abstract: Entanglements like vines and branches in natural settings or cords and pipes in human spaces prevent mobile robots from accessing many environments. Legged robots should be effective in these settings, and more so than wheeled or tracked platforms, but naive controllers quickly become entangled and stuck. In this paper we present a method for proprioception aimed specifically at sensing entanglements of a robot's legs, as well as a reaction strategy to disentangle legs during their swing phase as they advance to their next foothold. We demonstrate that our proprioception and reaction strategy enables traversal of entanglements of many stiffnesses and geometries, succeeding in 14 out of 16 trials in laboratory tests as well as in a natural outdoor environment.
|
| |
| 15:30-17:00, Paper MoBIP-08.8 | Add to My Program |
| Learning a Single Policy for Diverse Behaviors on a Quadrupedal Robot Using Scalable Motion Imitation |
|
| Klipfel, Arnaud | Georgia Tech |
| Sontakke, Nitish Rajnish | Georgia Institute of Technology |
| Liu, Ren | Georgia Institute of Technology |
| Ha, Sehoon | Georgia Institute of Technology |
Keywords: Legged Robots, Reinforcement Learning, Imitation Learning
Abstract: Learning various motor skills for quadrupedal robots is a challenging problem that requires careful design of task-specific mathematical models or reward descriptions. In this work, we propose to learn a single capable policy using deep reinforcement learning by imitating a large number of reference motions, including walking, turning, pacing, jumping, sitting, and lying. On top of the existing motion imitation framework, we first carefully design the observation space, the action space, and the reward function to improve the scalability of the learning as well as the robustness of the final policy. In addition, we adopt a novel adaptive motion sampling (AMS) method, which maintains a balance between successful and unsuccessful behaviors. This technique allows the learning algorithm to focus on challenging motor skills and avoid catastrophic forgetting. We demonstrate that the learned policy can exhibit diverse behaviors in simulation by successfully tracking both the training dataset and out-of-distribution trajectories. We also validate the importance of the proposed learning formulation and the adaptive motion sampling scheme by conducting experiments.
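The adaptive motion sampling idea above (spend more training time on reference motions the policy imitates poorly, without abandoning mastered ones) can be sketched with a simple weighting rule. The rule and the success scores are assumptions for illustration, not the paper's exact AMS scheme:

```python
import random

def sampling_weights(success_rates, floor=0.05):
    """Weight each reference motion by its failure rate, with a floor so
    well-tracked motions are still revisited (guards against forgetting)."""
    return [max(1.0 - s, floor) for s in success_rates]

# Hypothetical per-motion imitation success rates.
success = {"walk": 0.95, "jump": 0.40, "sit": 0.90}
names = list(success)
weights = sampling_weights([success[n] for n in names])

random.seed(0)
batch = random.choices(names, weights=weights, k=1000)
```

The hard "jump" clip dominates the sampled batch, while the floor keeps "walk" and "sit" in rotation, balancing successful and unsuccessful behaviors as the abstract describes.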
|
| |
| 15:30-17:00, Paper MoBIP-08.9 | Add to My Program |
| A Novel Lockable Spring-Loaded Prismatic Spine to Support Agile Quadrupedal Locomotion |
|
| Ye, Keran | University of California, Riverside |
| Chung, Kenneth | University of California, Riverside |
| Karydis, Konstantinos | University of California, Riverside |
Keywords: Legged Robots, Mechanism Design, Compliant Joints and Mechanisms
Abstract: This paper introduces a way to systematically investigate the effect of compliant prismatic spines in quadrupedal robot locomotion. We develop a novel spring-loaded lockable spine module, together with a new Spinal Compliance-Integrated Quadruped (SCIQ) platform for both empirical and numerical research. Individual spine tests reveal beneficial spinal characteristics like a degressive spring, and validate the efficacy of a proposed compact locking/unlocking mechanism for the spine. Benchmark vertical jumping and landing tests with our robot show comparable jumping performance between the rigid and compliant spines. An observed advantage of the compliant spine module is that it can alleviate more challenging landing conditions by absorbing impact energy and dissipating the remainder via feet slipping through much in cat-like stretching fashion.
|
| |
| 15:30-17:00, Paper MoBIP-08.10 | Add to My Program |
| Tunable Impact and Vibration Absorbing Neck for Robust Visual-Inertial State Estimation for Dynamic Legged Robots |
|
| Kim, Taekyun | Seoul National University |
| Kim, Sangbae | Massachusetts Institute of Technology |
| Lee, Dongjun | Seoul National University |
Keywords: Legged Robots, Mechanism Design, Visual-Inertial SLAM
Abstract: We propose a new neck design for legged robots to achieve robust visual-inertial state estimation during dynamic locomotion. While visual-inertial state estimation is widely used in robotics, it is disturbed by the impacts and vibration generated when legged robots move dynamically. Rubber dampers may be a solution, but even if the dampers suit some gaits, they may deform excessively or resonate at certain frequencies during other gaits, since they are not tunable. To address this problem, we develop a tunable neck system that absorbs the impacts and vibration across diverse gaits. This neck system consists of two components: 1) a suspension mechanism that compensates for the weight of the head equipped with a camera and IMU (inertial measurement unit) and, acting as a fixed low-pass filter, absorbs impacts and high-frequency head motion including vibration; and 2) a dynamic vibration absorber (DVA) that can be reactively adjusted to diverse gait frequencies to alleviate excessive head movements. We present a dynamics analysis of the neck system and show how to adjust the target frequency of the system. Simulation and experimental validation verify the effect of the proposed neck design, demonstrating superior estimation performance and robustness across diverse gaits.
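The retuning of the DVA above can be illustrated with the textbook relation for a mass-spring absorber: its natural frequency sqrt(k/m) is matched to the disturbance (gait) frequency. The masses and frequencies below are illustrative assumptions, not the paper's hardware values:

```python
import math

def absorber_stiffness(m_absorber, gait_freq_hz):
    """Spring stiffness [N/m] that tunes a mass-spring DVA of mass
    m_absorber [kg] to a gait frequency [Hz]: k = m * (2*pi*f)^2."""
    omega = 2.0 * math.pi * gait_freq_hz
    return m_absorber * omega ** 2

# Retune an assumed 50 g absorber from a 3 Hz gait to a 5 Hz gait.
k_trot = absorber_stiffness(0.05, 3.0)
k_bound = absorber_stiffness(0.05, 5.0)
```

Because stiffness scales with the square of the target frequency, a tunable element must cover a wide stiffness range, which motivates an adjustable mechanism over fixed rubber dampers.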
|
| |
| 15:30-17:00, Paper MoBIP-08.11 | Add to My Program |
| Embodying Quasi-Passive Modal Trotting and Pronking in a Sagittal Quadruped |
|
| Calzolari, Davide | German Aerospace Center, Technical University of Munich |
| Della Santina, Cosimo | TU Delft |
| Giordano, Alessandro Massimo | DLR (German Aerospace Center) |
| Schmidt, Annika | Technical University of Munich (TUM) |
| Albu-Schäffer, Alin | DLR - German Aerospace Center |
Keywords: Legged Robots, Natural Machine Motion, Passive Walking
Abstract: Animals rely on the elasticity of their tendons and muscles to execute robust and efficient locomotion patterns over a vast and continuous range of velocities. Replicating such capabilities in artificial systems is a long-standing challenge in robotics. By taking advantage of a pitch-dynamics-decoupling spring potential, this work aims to provide design rules and a control strategy to generate dynamic, efficient locomotion patterns in quadrupeds moving in a sagittal plane. We rely on nonlinear modal theory, which provides the tools to characterize continuous families of efficient oscillations in nonlinear mechanical systems. We provide simulations of an elastic quadruped showing that the proposed solution can robustly excite efficient locomotion patterns under non-ideal conditions.
|
| |
| 15:30-17:00, Paper MoBIP-08.12 | Add to My Program |
| Design, Modeling and Control of a Quadruped Robot SPIDAR: Spherically Vectorable and Distributed Rotors Assisted Air-Ground Quadruped Robot |
|
| Zhao, Moju | The University of Tokyo |
| Anzai, Tomoki | The University of Tokyo |
| Nishio, Takuzumi | The University of Tokyo |
Keywords: Legged Robots, Aerial Systems: Mechanics and Control, Motion Control
Abstract: Multimodal locomotion capability is an emerging topic in the robotics field, and various novel mobile robots have been developed to enable maneuvering in both terrestrial and aerial domains. Among these hybrid robots, several state-of-the-art bipedal robots enable complex walking motion interlaced with flying. Manipulation ability is also desired for these robots; however, it is difficult for the current forms to maintain stability during joint motion in midair due to their centralized rotor arrangement. Therefore, in this work, we develop a novel air-ground quadruped robot called SPIDAR, which is assisted by spherically vectorable rotors distributed in each link to enable both walking motion and transformable flight. First, we present a unique mechanical design for a quadruped robot that enables terrestrial and aerial locomotion. We then present the modeling method for this hybrid robot platform, and further develop an integrated control strategy for both walking and flying with joint motion. Finally, we demonstrate the feasibility of the proposed hybrid quadruped robot by performing a seamless motion that involves static walking and subsequent flight. To the best of our knowledge, this work is the first to achieve a quadruped robot with such multimodal locomotion capability.
|
| |
| MoBIP-09 Regular session, Hall E |
Add to My Program |
| Clone of 'Motion and Path Planning II' |
|
| |
| |
| 15:30-17:00, Paper MoBIP-09.1 | Add to My Program |
| Time-Optimal Spiral Trajectories with Closed-Form Solutions |
|
| Draelos, Mark | University of Michigan |
Keywords: Motion and Path Planning, Dynamics
Abstract: The Archimedean spiral is a space-filling plane curve found in applications ranging from coverage path planning for robot exploration to scan pattern generation for medical imaging. The constant linear velocity (CLV) parameterization of this spiral is of particular interest due to its fixed path velocity and isotropic sampling capability, but the high accelerations near its origin singularity yield poor trajectory tracking that limits its utility. Here, I derive a closed-form time-optimal time scaling for CLV spirals with large path velocities that mitigates the singularity by inspecting the CLV spiral's acceleration envelope. When applied to a two-degree-of-freedom Cartesian scanner, I demonstrate that this approach reduces trajectory tracking error by up to 97.1% compared to naive CLV spirals, with low computational overhead. I further show that this time scaling eliminates the central image distortion near the origin for scanning applications that rely on CLV spirals.
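For readers unfamiliar with the CLV parameterization, the spiral r = a·θ traversed at constant path speed v admits the standard large-θ approximation θ(t) = sqrt(2vt/a). A minimal sketch under that assumption (names and constants are illustrative, not from the paper; near the origin the approximation degrades, which is exactly where the singularity discussed above bites):

```python
import math

def clv_spiral_point(a, v, t):
    """Point at time t on the Archimedean spiral r = a*theta traversed
    at approximately constant linear speed v.

    Uses the large-theta approximation theta(t) = sqrt(2*v*t/a), valid
    away from the origin singularity.
    """
    theta = math.sqrt(2.0 * v * t / a)
    r = a * theta
    return r * math.cos(theta), r * math.sin(theta)

# Away from the origin, consecutive samples are roughly v*dt apart.
a, v, dt = 0.5, 1.0, 1e-3
p0 = clv_spiral_point(a, v, 10.0)
p1 = clv_spiral_point(a, v, 10.0 + dt)
step = math.dist(p0, p1)
```

The exact path speed under this parameterization is v·sqrt(1 + 1/θ²), so the spacing between samples approaches v·dt as θ grows.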
|
| |
| 15:30-17:00, Paper MoBIP-09.2 | Add to My Program |
| Optimal Path Planning through a Sequence of Waypoints |
|
| Goutham, Mithun | Ohio State University |
| Boyle, Stephen | Ohio State University |
| Menon, Meghna | Ford Motor Company |
| Mohan, Shankar | Ford |
| Garrow, Sarah | Ford Motor Company |
| Stockar, Stephanie | Ohio State University |
Keywords: Motion and Path Planning, Intelligent and Flexible Manufacturing, Industrial Robots
Abstract: This paper presents a deterministic approach for finding the optimal path through a sequence of spatial waypoints while accounting for vertex or turn costs. A case study is presented where the proposed algorithm is used to determine the optimal path through a sequence of waypoints. This is then compared with the path obtained when considering only two consecutive waypoints at a time. Further, an approximation that uses three waypoints at a time in a staggered manner is described. This approach is shown to be computationally efficient and finds the optimal path in a case study with 2000 waypoints.
|
| |
| 15:30-17:00, Paper MoBIP-09.3 | Add to My Program |
| Efficient Path Planning in Manipulation Planning Problems by Actively Reusing Validation Effort |
|
| Hartmann, Valentin Noah | University of Stuttgart |
| Ortiz-Haro, Joaquim | University of Stuttgart |
| Toussaint, Marc | TU Berlin |
Keywords: Motion and Path Planning, Manipulation Planning
Abstract: The path planning problems arising in manipulation planning and in task and motion planning settings are typically repetitive: the same manipulator moves in a space that only changes slightly. Despite this potential for reuse of information, few planners fully exploit the available information. To better enable this reuse, we decompose the collision checking into reusable, and non-reusable parts. We then treat the sequences of path planning problems in manipulation planning as a multiquery path planning problem. This allows the usage of planners that actively minimize planning effort over multiple queries, and by doing so, actively reuse previous knowledge. We implement this approach in EIRM* and effort ordered LazyPRM*, and benchmark it on multiple simulated robotic examples. Further, we show that the approach of decomposing collision checks additionally enables the reuse of the gained knowledge over multiple different instances of the same problem, i.e., in a multiquery manipulation planning scenario. The planners using the decomposed collision checking outperform the other planners in initial solution time by up to a factor of two while providing a similar solution quality.
|
| |
| 15:30-17:00, Paper MoBIP-09.4 | Add to My Program |
| Improving Reliable Navigation under Uncertainty Via Predictions Informed by Non-Local Information |
|
| Arnob, Raihan Islam | George Mason University |
| Stein, Gregory | George Mason University |
Keywords: Motion and Path Planning, Autonomous Agents, AI-Enabled Robotics
Abstract: We improve reliable, long-horizon, goal-directed navigation in partially-mapped environments by using non-locally available information to predict the goodness of temporally-extended actions that enter unseen space. Making predictions about where to navigate in general requires non-local information: any observations the robot has seen so far may provide information about the goodness of a particular direction of travel. Building on recent work in learning-augmented model-based planning under uncertainty, we present an approach that can both rely on non-local information to make predictions (via a graph neural network) and is reliable by design: it will always reach its goal, even when learning does not provide accurate predictions. We conduct experiments in three simulated environments in which non-local information is needed to perform well. In our large-scale university building environment, generated to scale from real-world floorplans, we demonstrate a 9.3% reduction in cost-to-go compared to a non-learned baseline and a 14.9% reduction compared to a learning-informed planner that can only use local information to inform its predictions.
|
| |
| 15:30-17:00, Paper MoBIP-09.5 | Add to My Program |
| TOP-UAV: Open-Source Time-Optimal Trajectory Planner for Point-Masses under Acceleration and Velocity Constraints |
|
| Meyer, Fabian | FZI Forschungszentrum Informatik |
| Glock, Katharina | FZI Forschungszentrum Informatik |
| Sayah, David | FZI Forschungszentrum Informatik |
Keywords: Motion and Path Planning, Optimization and Optimal Control, Kinematics
Abstract: In the latest research on unmanned aerial vehicles (UAVs), time-optimal trajectory planning of a point-mass with acceleration as control input and constrained maximum velocity (TOT-PMAV) has proved very promising for UAV behavior planning. Such trajectories can be calculated within microseconds and tracked with high precision by modern trajectory tracking controllers like model predictive control (MPC). However, recent research shows that the state-of-the-art (SOTA) approach to generating these time-optimal trajectories is based on an invalid method for synchronizing the coordinate axes, which sometimes yields trajectories that miss the desired final state by far. Hence, an alternative approach was proposed that claims to resolve this issue; however, it has lacked a mathematical proof of correctness. In this work, we provide the missing proof and mathematically demonstrate the problems arising from the SOTA approach. Further, since neither the SOTA nor the alternative solution approach utilizes the full kinematic capacity of a UAV, we propose an improved solution approach to the TOT-PMAV that better exploits kinematic properties and yields, on average, up to 14% faster trajectories. We substantiate our findings with an extensive computational study, show in which situations the SOTA is likely to fail, and provide metrics to measure the consequences of failure. To enable reproducibility, our code is open-source.
|
| |
| 15:30-17:00, Paper MoBIP-09.6 | Add to My Program |
| Fast Asymptotically Optimal Path Planning in Dynamic, Uncertain Environments |
|
| Huang, Lu | City University of Hong Kong |
| Jing, Xingjian | City University of Hong Kong |
Keywords: Motion and Path Planning
Abstract: This paper presents Fast Adaptive Tree (FAT), an asymptotically-optimal sampling-based path planner for dynamic and uncertain scenarios. Namely, the solution extracted converges to the optimal solution given the sensor information as the number of samples approaches infinity. The planner maintains an underlying graph, which increasingly approximates the search domain, and a dynamic spanning tree of the graph, which contains the shortest path from the start to the goal state. The planner quickly responds to the availability of new information about the environments or the robot movements by minimally repairing the spanning tree over the navigation. The simulation results show that the proposed path planner achieves higher efficiency of replanning than several state-of-the-art path planners without sacrificing solution quality.
|
| |
| 15:30-17:00, Paper MoBIP-09.7 | Add to My Program |
| An Efficient Trajectory Planner for Car-Like Robots on Uneven Terrain |
|
| Xu, Long | Zhejiang University |
| Chai, Kaixin | Xi'an Jiaotong University |
| Han, Zhichao | Zhejiang University |
| Liu, Hong | Hangzhou City University |
| Xu, Chao | Zhejiang University |
| Cao, Yanjun | Zhejiang University, Huzhou Institute of Zhejiang University |
| Gao, Fei | Zhejiang University |
Keywords: Motion and Path Planning, Nonholonomic Motion Planning, Autonomous Vehicle Navigation
Abstract: Autonomous navigation of ground robots on uneven terrain is being considered in more and more tasks. However, uneven terrain brings two problems to motion planning: how to assess the traversability of the terrain, and how to cope with the dynamics model of the robot associated with the terrain. The trajectories generated by existing methods are often too conservative, or cannot be tracked well by the controller because the second problem is not well solved. In this paper, we propose terrain pose mapping to describe the impact of terrain on the robot. With this mapping, we can obtain the SE(3) state of the robot on uneven terrain for a given state in SE(2). Based on it, we then present a trajectory optimization framework for car-like robots on uneven terrain that can address both of the above problems. The trajectories generated by our method conform to the dynamics model of the system without being overly conservative, yet can still be tracked well by the controller. We perform simulations and real-world experiments to validate the efficiency and trajectory quality of our algorithm.
|
| |
| 15:30-17:00, Paper MoBIP-09.8 | Add to My Program |
| Robots As AI Double Agents: Privacy in Motion Planning |
|
| Shome, Rahul | The Australian National University |
| Kingston, Zachary | Rice University |
| Kavraki, Lydia | Rice University |
Keywords: Motion and Path Planning
Abstract: Robotics and automation are poised to change the landscape of home and work in the near future. Robots are adept at deliberately moving, sensing, and interacting with their environments. The pervasive use of this technology promises societal and economic payoffs due to its capabilities---conversely, the capabilities of robots to move within and sense the world around them are susceptible to abuse. Robots, unlike typical sensors, are inherently autonomous, active, and deliberate. Such automated agents can become AI double agents liable to violate the privacy of coworkers, privileged spaces, and other stakeholders. In this work, we highlight the understudied and inevitable threats to privacy that can be posed by the autonomous, deliberate motions and sensing of robots. We frame the problem within broader sociotechnological questions alongside a comprehensive review. The privacy-aware motion planning problem is formulated in terms of cost functions that can be modified to induce privacy-aware behavior---preserving, agnostic, or violating. Simulated case studies in manipulation and navigation, with altered cost functions, are used to demonstrate how privacy-violating threats can be easily injected, sometimes with only small changes in performance (solution path lengths). Such functionality is already widely available. This preliminary work is meant to lay the foundations for near-future, holistic, interdisciplinary investigations that can address questions surrounding privacy in intelligent robotic behaviors determined by planning algorithms.
|
| |
| 15:30-17:00, Paper MoBIP-09.9 | Add to My Program |
| Bang-Bang Boosting of RRTs |
|
| LaValle, Alexander J. | University of Oulu |
| Sakcak, Basak | University of Oulu |
| LaValle, Steven M | University of Oulu |
Keywords: Motion and Path Planning
Abstract: This paper presents methods for dramatically improving the performance of sampling-based kinodynamic planners. The key component is a complete, exact steering method that produces a time-optimal trajectory between any pair of states for a vector of synchronized double integrators. This method is applied in three ways: 1) to generate RRT edges that quickly solve the two-point boundary-value problems, 2) to produce a (quasi)metric for more accurate Voronoi bias in RRTs, and 3) to iteratively time-optimize a given collision-free trajectory. Experiments are performed for state spaces with up to 2000 dimensions, resulting in improved computed trajectories and orders-of-magnitude improvements in computation time over using ordinary metrics and constant controls.
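As standard background for the steering method above, the time-optimal rest-to-rest duration of a single double integrator under acceleration and velocity bounds has a well-known closed form (bang-bang, or bang-coast-bang when the velocity limit is reached). A hedged sketch of that textbook formula only; the paper's method additionally solves exact synchronization across a vector of axes and arbitrary boundary states, which is not shown here:

```python
import math

def min_time_rest_to_rest(d, u_max, v_max=math.inf):
    """Time-optimal rest-to-rest duration for a double integrator
    x'' = u with |u| <= u_max and |x'| <= v_max over distance d.

    Triangular profile (accelerate, then decelerate) when the peak
    speed sqrt(d*u_max) stays under v_max; otherwise trapezoidal
    (accelerate, coast at v_max, decelerate).
    """
    d = abs(d)
    v_peak = math.sqrt(d * u_max)          # peak speed of triangular profile
    if v_peak <= v_max:
        return 2.0 * math.sqrt(d / u_max)  # bang-bang
    return d / v_max + v_max / u_max       # bang-coast-bang
```

For a vector of axes, each axis's minimum time can be computed this way, and the slowest axis then dictates the common duration to which the others must be synchronized.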
|
| |
| 15:30-17:00, Paper MoBIP-09.10 | Add to My Program |
| Geometric Gait Optimization for Inertia-Dominated Systems with Nonzero Net Momentum |
|
| Yang, Yanhao | Oregon State University |
| Hatton, Ross | Oregon State University |
Keywords: Nonholonomic Motion Planning, Motion and Path Planning, Nonholonomic Mechanisms and Systems
Abstract: Inertia-dominated mechanical systems can achieve net displacement by 1) periodically changing their shape (known as kinematic gait) and 2) adjusting their inertia distribution to utilize the existing nonzero net momentum (known as momentum gait). Therefore, finding the gait that most effectively utilizes the two types of locomotion in terms of the magnitude of the net momentum is a significant topic in the study of locomotion. For kinematic locomotion with zero net momentum, the geometry of optimal gaits is expressed as the equilibria of system constraint curvature flux through the surface bounded by the gait, and the cost associated with executing the gait in the metric space. In this paper, we identify the geometry of optimal gaits with nonzero net momentum effects by lifting the gait description to a time-parameterized curve in shape-time space. We also propose the variational gait optimization algorithm corresponding to the lifted geometric structure, and identify two distinct patterns in the optimal motion, determined by whether or not the kinematic and momentum gaits are concentric. The examples of systems with and without fluid-added mass demonstrate that the proposed algorithm can efficiently solve forward and turning locomotion gaits in the presence of nonzero net momentum. At any given momentum and effort limit, the proposed optimal gait that takes into account both momentum and kinematic effects outperforms the reference gaits that each only considers one of these effects.
|
| |
| 15:30-17:00, Paper MoBIP-09.11 | Add to My Program |
| Real-Time Tube-Based Non-Gaussian Risk Bounded Motion Planning for Stochastic Nonlinear Systems in Uncertain Environments Via Motion Primitives |
|
| Han, Weiqiao | Massachusetts Institute of Technology |
| M. Jasour, Ashkan | MIT |
| Williams, Brian | MIT |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation, Probability and Statistical Methods
Abstract: We consider the motion planning problem for stochastic nonlinear systems in uncertain environments. More precisely, in this problem the robot has stochastic nonlinear dynamics and uncertain initial locations, and the environment contains multiple dynamic uncertain obstacles. Obstacles can be of arbitrary shape, can deform, and can move. The uncertainties are not necessarily Gaussian. This general setting has been considered and solved in [1]. In addition to the assumptions above, in this paper we consider long-term tasks, where the planning method in [1] would fail, as the uncertainty of the system states grows too large over a long time horizon. Unlike [1], we present a real-time online motion planning algorithm. We build discrete-time motion primitives and their corresponding continuous-time tubes offline, so that almost all system states of each motion primitive are guaranteed to stay inside the corresponding tube. We convert probabilistic safety constraints into a set of deterministic constraints called risk contours. During online execution, we verify the safety of the tubes against deterministic risk contours using sum-of-squares (SOS) programming. The provided SOS-based method verifies the safety of the tube in the presence of uncertain obstacles in real time, without the need for uncertainty samples or time discretization. By bounding the probability that the system states stay inside the tube and bounding the probability of the tube colliding with obstacles, our approach guarantees a bounded probability of system states colliding with obstacles. We demonstrate our approach on several long-term robotics tasks.
|
| |
| 15:30-17:00, Paper MoBIP-09.12 | Add to My Program |
| Parallelized Control-Aware Motion Planning with Learned Controller Proxies |
|
| Chow, Scott | Oregon State University |
| Chang, Dongsik | Amazon |
| Hollinger, Geoffrey | Oregon State University |
Keywords: Motion and Path Planning, Integrated Planning and Control, Integrated Planning and Learning
Abstract: Kinodynamic motion planning enables autonomous robots to find efficient paths while minimizing energy expenditure and avoiding hazards in the environment. However, during plan execution, the controller may deviate from the collision-free path found by the planner due to discrepancies between planning and control, causing inaccurate estimation of path costs and potentially collisions with obstacles. While this can be mitigated by incorporating the vehicle controller into planning, these approaches are generally bottlenecked by the high computation cost of simulating the vehicle dynamics and controller. This paper presents the Parallel Closed-Loop RRT* motion planner that uses a fast neural network controller as a substitute for a computationally-demanding controller during planning. Using a neural network controller and parallelizing the planning process makes closed-loop planning tractable for vehicles with nonlinear dynamics and significantly reduces planning time. Experiments on a simulated underwater vehicle with a model predictive controller demonstrate that our approach yields feasible plans that are more likely to be successfully executed without collisions compared to planners that do not consider the controller.
|
| |
| 15:30-17:00, Paper MoBIP-09.13 | Add to My Program |
| Improvement of Submodular Maximization Problems with Routing Constraints Via Submodularity and Fourier Sparsity |
|
| Lin, Pao-Te | National Central University |
| Tseng, Kuo-Shih | National Central University |
Keywords: Mapping, Search and Rescue Robots, Motion and Path Planning
Abstract: Various robotic problems (e.g., map exploration, environmental monitoring, and spatial search) can be formulated as submodular maximization problems with routing constraints. These problems involve two NP-hard problems: maximal coverage and the traveling salesman problem. The generalized cost-benefit algorithm (GCB) solves this problem with a (1/2)(1 - 1/e)·OPT~ guarantee, where OPT~ is an approximation of the optimal performance; a gap remains between OPT~ and the optimal solution OPT. In this research, the proposed algorithms, Tree-Structured Fourier Supports Set (TS-FSS), utilize the submodularity and sparsity of routing trees to boost GCB performance. The theorems show that the proposed algorithms have a higher performance bound than GCB. The experiments demonstrate that the proposed approach outperforms benchmark approaches.
|
| |
| MoBIP-10 Regular session, Hall E |
Add to My Program |
| Clone of 'Learning for Manipulation II' |
|
| |
| |
| 15:30-17:00, Paper MoBIP-10.1 | Add to My Program |
| Learning Bifunctional Push-Grasping Synergistic Strategy for Goal-Agnostic and Goal-Oriented Tasks |
|
| Ren, Dafa | Shanghai University |
| Wu, Shuang | Huawei |
| Wang, Xiaofan | Shanghai University |
| Peng, Yan | Shanghai University |
| Ren, Xiaoqiang | Shanghai University |
Keywords: Grasping, Reinforcement Learning, Deep Learning in Grasping and Manipulation
Abstract: Both goal-agnostic and goal-oriented tasks have practical value for robotic grasping: goal-agnostic tasks target all objects in the workspace, while goal-oriented tasks aim at grasping pre-assigned goal objects. However, most current grasping methods perform well on only one of these tasks. In this work, we propose a bifunctional push-grasping synergistic strategy for goal-agnostic and goal-oriented grasping tasks. Our method integrates pushing along with grasping to pick up all objects or pre-assigned goal objects with high action efficiency depending on the task requirement. We introduce a bifunctional network, which takes in visual observations and outputs dense pixel-wise maps of Q values for pushing and grasping primitive actions, to increase the available samples in the action space. Then we propose a hierarchical reinforcement learning framework to coordinate the two tasks by considering the goal-agnostic task as a combination of multiple goal-oriented tasks. To reduce the training difficulty of the hierarchical framework, we design a two-stage training method to train the two types of tasks separately. We perform pre-training of the model in simulation, and then transfer the learned model to the real world without any additional real-world fine-tuning. Experimental results show that the proposed approach outperforms existing methods in task completion rate and grasp success rate with fewer motions. Supplementary material is available at https://github.com/DafaRen/Learning_Bifunctional_Push-grasping_Synergistic_Strategy_for_Goal-agnostic_and_Goal-oriented_Tasks
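Dense pixel-wise Q maps like those described above are typically consumed by a greedy selector that picks the primitive and pixel with the highest value. A generic sketch of that common pattern (not the authors' code; the function name and the nested-list map representation are assumptions for illustration):

```python
def select_primitive_action(q_push, q_grasp):
    """Greedy action over dense pixel-wise Q maps.

    Each map is a 2D grid (list of rows) with one Q value per image
    pixel; the chosen pixel gives the action location, and the
    higher-valued map gives the primitive (push or grasp).
    """
    def best_pixel(q):
        # Row with the largest entry, then the column of that entry.
        i, row = max(enumerate(q), key=lambda t: max(t[1]))
        j = max(range(len(row)), key=row.__getitem__)
        return (i, j), row[j]

    choice = None
    for name, q in (("push", q_push), ("grasp", q_grasp)):
        pixel, value = best_pixel(q)
        if choice is None or value > choice[2]:
            choice = (name, pixel, value)
    return choice[0], choice[1]
```

In practice a learned policy would also mask invalid pixels and add exploration noise during training; this sketch shows only the greedy argmax step.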
|
| |
| 15:30-17:00, Paper MoBIP-10.2 | Add to My Program |
| Visual Spatial Attention and Proprioceptive Data-Driven Reinforcement Learning for Robust Peg-In-Hole Task under Variable Conditions |
|
| Yasutomi, André Yuji | Hitachi Ltd |
| Ichiwara, Hideyuki | Hitachi, Ltd. / Waseda University |
| Ito, Hiroshi | Hitachi, Ltd |
| Mori, Hiroki | Waseda University |
| Ogata, Tetsuya | Waseda University |
Keywords: Robotics and Automation in Construction, Reinforcement Learning, Deep Learning for Visual Perception
Abstract: Anchor-bolt insertion is a peg-in-hole task performed in the construction field for holes in concrete. Efforts have been made to automate this task, but the variable lighting and hole surface conditions, as well as the requirements for short setup and task execution time make the automation challenging. In this study, we introduce a vision and kinesthetic data-driven robot control model for this task that is robust to challenging lighting and hole surface conditions. This model consists of a spatial attention point network (SAP) and a deep reinforcement learning (DRL) policy that are trained jointly end-to-end to control the robot. The model is trained in an offline manner, with a sample-efficient framework designed to reduce training time and minimize the reality gap when transferring the model to the physical world. Through evaluations with an industrial robot performing the task in 12 unknown holes, starting from 16 different initial positions, and under three different lighting conditions (two with misleading shadows), we demonstrate that SAP can generate relevant attention points of the image even in challenging lighting conditions. We also show that the proposed model enables task execution with higher success rate and shorter task completion time than various baselines. Due to the proposed model's high effectiveness even in severe lighting, initial positions, and hole conditions, and the offline training framework's high sample-efficiency and short training time, this approach can be easily applied to construction.
|
| |
| 15:30-17:00, Paper MoBIP-10.3 | Add to My Program |
| Domain Adaptation on Point Clouds for 6D Pose Estimation in Bin-Picking Scenarios |
|
| Zhao, Liang | Tsinghua University |
| Sun, Meng | Tsinghua University |
| Lv, Weijie | Tsinghua University |
| Zhang, Xinyu | Tsinghua University |
| Zeng, Long | Tsinghua University |
Keywords: Computer Vision for Manufacturing, Transfer Learning, Deep Learning in Grasping and Manipulation
Abstract: Training with simulated data is a common approach in pose estimation research. However, the sim-to-real gap between clean simulated data and noisy real data seriously weakens the generalization ability of an algorithm, especially for point clouds. To address this problem, this paper proposes a domain adaptive pose estimation network (DAPE-Net). For features extracted from the backbone, a feature discriminator distinguishes real from simulated data, and the network completes pose estimation via adversarial training. This drives the network to attend to the domain-invariant features shared by simulated and real point clouds, achieving domain adaptation. In our experiments, DAPE-Net improved pose estimation performance by 10%. To meet domain adaptation's requirement for a small amount of real data, we propose a scheme that semi-automatically collects real data in bin-picking scenarios for 6D pose estimation.
|
| |
| 15:30-17:00, Paper MoBIP-10.4 | Add to My Program |
| Learning Robotic Powder Weighing from Simulation for Laboratory Automation |
|
| Kadokawa, Yuki | Nara Institute of Science and Technology |
| Hamaya, Masashi | OMRON SINIC X Corporation |
| Tanaka, Kazutoshi | OMRON SINIC X Corporation |
Keywords: Robotics and Automation in Life Sciences
Abstract: This study focuses on a robotic powder weighing task used in laboratory automation. In this task, a robot weighs a certain amount of powder with a milligram-level target mass using a dispensing spoon. The complex dynamics of the powder, the variations in the materials being weighed, and the need to balance conservative and aggressive actions are significant challenges in the robotics field. Therefore, learning approaches are critical for this task. However, many learning interactions in real-world environments require substantial effort to clean up spilled powder. To overcome this issue, this study employs a sim-to-real transfer learning approach using a domain randomization (DR) technique. This enables the robot to weigh various powders with a small target mass and alleviates the burden of collecting data in a real-world environment. Herein, we formulate weighing manipulation as a reinforcement learning problem. In addition, we develop a powder weighing simulator and carefully select the dynamics parameters used for DR to adapt to unseen environments. A recurrent neural network-based policy is adopted to balance conservative and aggressive actions. The sim-to-real zero-shot transfer experiments demonstrated that the robot completed the weighing tasks with an average weighing error of 0.1 -- 0.2 mg for different powder materials and target masses (5 -- 15 mg). Overall, this approach shows promising results and can be useful for automating laboratory tasks that involve weighing powders.
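The domain randomization step described above typically resamples simulator dynamics parameters at the start of each training episode so the policy never overfits to one environment. A generic sketch of that pattern (the parameter names and ranges are illustrative placeholders, not the paper's calibrated values):

```python
import random

def sample_powder_dynamics(rng=random):
    """Sample simulator dynamics parameters for one training episode.

    Generic domain-randomization pattern: draw each parameter from a
    range wide enough to cover the real system's unknown value. The
    names and bounds here are made-up examples.
    """
    return {
        "friction":   rng.uniform(0.2, 0.9),       # grain-grain friction
        "density":    rng.uniform(300.0, 1500.0),  # kg/m^3
        "cohesion":   rng.uniform(0.0, 1.0),       # normalized stickiness
        "spoon_gain": rng.uniform(0.8, 1.2),       # actuation scale factor
    }
```

A new dictionary would be drawn per episode and injected into the simulator before rollout; a policy trained over many such draws tends to transfer zero-shot, as in the experiments reported above.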
|
| |
| 15:30-17:00, Paper MoBIP-10.5 | Add to My Program |
| Constrained Generative Sampling of 6-DoF Grasps |
|
| Lundell, Jens | Royal Institute of Technology |
| Verdoja, Francesco | Aalto University |
| Nguyen Le, Tran | Aalto University |
| Mousavian, Arsalan | NVIDIA |
| Fox, Dieter | University of Washington |
| Kyrki, Ville | Aalto University |
Keywords: Deep Learning in Grasping and Manipulation, Grasping
Abstract: Most state-of-the-art data-driven grasp sampling methods propose stable and collision-free grasps uniformly on the target object. For bin-picking, executing any of those reachable grasps is sufficient. However, for completing specific tasks, such as squeezing out liquid from a bottle, we want the grasp to be on a specific part of the object's body while avoiding other locations, such as the cap. This work presents a generative grasp sampling network, VCGS, capable of constrained 6-Degrees-of-Freedom (DoF) grasp sampling. In addition, we also curate a new dataset designed to train and evaluate methods for constrained grasping. The new dataset, called CONG, consists of over 14 million training samples of synthetically rendered point clouds and grasps at random target areas on 2889 objects. VCGS is benchmarked against GraspNet, a state-of-the-art unconstrained grasp sampler, in simulation and on a real robot. The results demonstrate that VCGS achieves a 10-15% higher grasp success rate than the baseline while being 2-3 times as sample efficient. Supplementary material is available on our project website.
|
| |
| 15:30-17:00, Paper MoBIP-10.6 | Add to My Program |
| RGBD Fusion Grasp Network with Large-Scale Tableware Grasp Dataset |
|
| Yoon, Jaemin | Samsung Research |
| Ahn, Joonmo | Samsung Electronics |
| Ha, Changsu | Samsung Electronics |
| Chung, Rakjoon | Samsung Electronics |
| Park, Dongwoo | Samsung Electronics |
| Han, Heungwoo | Samsung Research |
| Kang, Sung-Chul | Samsung Research, Samsung Electronics |
Keywords: Deep Learning in Grasping and Manipulation, Data Sets for Robot Learning, Grasping
Abstract: This paper proposes a novel approach to address the technical challenges of stable object grasping, particularly in the context of handling tableware in a home environment. Handling tableware is particularly important, yet challenging, due to the flat nature of most tableware objects and the need to maintain a stable posture to prevent spills. To address these challenges, we present three key contributions: 1) a large-scale tableware dataset, not commonly found in previous datasets; 2) a novel sampling method for stable grasp pose generation; and 3) a multi-modal fusion grasp network that effectively learns 6-DoF grasp poses, including for flat objects. Our dataset contains over 45 million grasp poses and 1 million RGBD images captured in 800 scenes, which include 10-18 randomly selected tableware objects under 4 different lighting conditions. The grasp poses in the dataset are generated using a novel sampling method that incorporates geometric analysis to ensure stable grasping with minimal object movement. Furthermore, we design an RGBD fusion grasp network (RGBD-FGN) that combines information from RGB and depth images while considering the characteristics of each modality. Our experimental results demonstrate the superior performance of our approach over existing techniques, which is a significant contribution towards developing a multitasking home robot. Our dataset and source code can be accessed at https://github.com/SamsungLabs/RGBD-FGN.
|
| |
| 15:30-17:00, Paper MoBIP-10.7 | Add to My Program |
| One-Shot Affordance Learning (OSAL): Learning to Manipulate Articulated Objects by Observing Once |
|
| Fan, Ruomeng | The University of Tokyo |
| Wang, Taohan | The University of Tokyo School of Engineering |
| Hirano, Masahiro | The University of Tokyo |
| Yamakawa, Yuji | The University of Tokyo |
Keywords: Learning from Demonstration, Deep Learning in Grasping and Manipulation
Abstract: We present One-Shot Affordance Learning (OSAL): a unified pipeline that learns manipulation of articulated objects by observing a human demonstration only once. The key idea of our method is to embody the affordance of articulated objects with an open-loop trajectory conditioned on a certain area of the object's surface. This serves as a simplified object-centric manipulation representation that can be easily transferred into robot motion, whereas traditional methods fail to deal with the configuration difference between human hands and robot end effectors. Our system extracts the embodied affordance by focusing on the hand action's effect on the object, and further grounds such affordance in object visual features through self-supervised learning for novel object configurations. We evaluated our method on a collection of real-life objects and furniture and demonstrated high success rates. With our system, humans only need to manipulate a novel object once, with any gesture, to transfer that manipulation skill to the robot, which we believe to be a highly efficient and user-friendly paradigm for future real-life robots.
|
| |
| 15:30-17:00, Paper MoBIP-10.8 | Add to My Program |
| EARL: Eye-On-Hand Reinforcement Learner for Dynamic Grasping with Active Pose Estimation |
|
| Huang, Baichuan | Rutgers University |
| Yu, Jingjin | Rutgers University |
| Jain, Siddarth | Mitsubishi Electric Research Laboratories (MERL) |
Keywords: Grasping, Perception for Grasping and Manipulation, Reinforcement Learning
Abstract: We explore the dynamic grasping of moving objects through active pose tracking and reinforcement learning for hand-eye coordination systems. Most existing vision-based robotic grasping methods implicitly assume target objects are stationary or moving predictably. Performing grasping of unpredictably moving objects presents a unique set of challenges. For example, a pre-computed robust grasp can become unreachable or unstable as the target object moves, and motion planning must also be adaptive. In this work, we present a new approach, Eye-on-hAnd Reinforcement Learner (EARL), for enabling coupled Eye-on-Hand (EoH) robotic manipulation systems to perform real-time active pose tracking and dynamic grasping of novel objects without explicit motion prediction. EARL readily addresses many thorny issues in automated hand-eye coordination, including fast-tracking of 6D object pose from vision, learning control policy for a robotic arm to track a moving object while keeping the object in the camera's field of view, and performing dynamic grasping. We demonstrate the effectiveness of our approach in extensive experiments validated on multiple commercial robotic arms in both simulations and complex real-world tasks.
|
| |
| 15:30-17:00, Paper MoBIP-10.9 | Add to My Program |
| KGNv2: Separating Scale and Pose Prediction for Keypoint-Based Grasp Synthesis on RGB-D Input |
|
| Chen, Yiye | Georgia Institute of Technology |
| Xu, Ruinian | Georgia Institute of Technology |
| Lin, Yunzhi | Georgia Institute of Technology |
| Chen, Hongyi | Georgia Institute of Technology |
| Vela, Patricio | Georgia Institute of Technology |
Keywords: Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation, Grasping
Abstract: We propose an improved keypoint approach for 6-DoF grasp pose synthesis from RGB-D input. Keypoint-based grasp detection from image input demonstrated promising results in a previous study, where the visual information provided by color imagery compensates for noisy or imprecise depth measurements. However, it relies heavily on accurate keypoint prediction in image space. We devise a new grasp generation network that reduces the dependency on precise keypoint estimation. Given an RGB-D input, the network estimates both the grasp pose and the camera-grasp length scale. Re-design of the keypoint output space mitigates the impact of keypoint prediction noise on Perspective-n-Point (PnP) algorithm solutions. Experiments show that the proposed method outperforms the baseline by a large margin, validating its design. Though trained only on simple synthetic objects, our method demonstrates sim-to-real capacity through competitive results in real-world robot experiments.
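The core idea of separating scale from pose can be sketched as follows: the pose branch (e.g., via PnP on keypoints) yields a translation known only up to scale, and a separately predicted camera-grasp length fixes the magnitude. This is a hedged illustration of the concept; the function name, shapes, and placeholder values are assumptions, not KGNv2's actual interface.

```python
import numpy as np

def compose_grasp(rotation, unit_translation, predicted_scale):
    """Combine an up-to-scale pose with a separately predicted length scale."""
    t_dir = unit_translation / np.linalg.norm(unit_translation)  # direction only
    return rotation, t_dir * predicted_scale                     # rescaled translation

R = np.eye(3)                     # placeholder rotation, e.g. from a PnP solver
t = np.array([0.0, 0.0, 2.0])     # translation recovered only up to scale
R_out, t_out = compose_grasp(R, t, predicted_scale=0.45)
```

Because keypoint noise mainly corrupts the recovered depth, predicting the scale separately keeps the final translation magnitude independent of that noise.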
|
| |
| 15:30-17:00, Paper MoBIP-10.10 | Add to My Program |
| Learning-Based Real-Time Torque Prediction for Grasping Unknown Objects with a Multi-Fingered Hand |
|
| Winkelbauer, Dominik | DLR |
| Bäuml, Berthold | German Aerospace Center (DLR) |
| Triebel, Rudolph | German Aerospace Center (DLR) |
Keywords: Deep Learning in Grasping and Manipulation, Grasping, Multifingered Hands
Abstract: When grasping objects with a multi-finger hand, it is crucial for the grasp stability to apply the correct torques at each joint so that external forces are countered. Most current systems use simple heuristics instead of modeling the required torque correctly. Instead, we propose a learning-based approach that is able to predict torques for grasps on unknown objects in real-time. The neural network, trained end-to-end using supervised learning, is shown to predict torques that are more efficient, and the objects are held with less involuntary movement compared to all tested heuristic baselines. Specifically, for 90% of the grasps the translational deviation of the object is below 2.9 mm and the rotational deviation below 3.1°. To generate training data, we formulate the analytical computation of torques as an optimization problem and handle the indeterminacy of multi-contacts using an elastic model. We further show that the network generalizes to predict torques for unknown objects on the real robot system with an inference time of 1.5 ms.
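The optimization-based torque computation used to generate training data can be illustrated in its simplest form: find minimum-norm contact forces f that counter an external wrench (G f = -w_ext), then map them to joint torques via tau = J^T f. This sketch uses toy 2D matrices and omits the paper's elastic multi-contact model; all names and values are illustrative assumptions.

```python
import numpy as np

def balance_wrench(G, J, w_ext):
    """Minimum-norm contact forces countering w_ext, mapped to joint torques."""
    f = -np.linalg.pinv(G) @ w_ext    # least-squares equilibrium solution
    tau = J.T @ f                     # contact forces -> joint torques
    return f, tau

G = np.array([[1.0, 0.0], [0.0, 1.0]])   # toy grasp matrix
J = np.array([[0.5, 0.0], [0.0, 0.5]])   # toy contact Jacobian
w_ext = np.array([0.0, -9.81])           # gravity wrench on the object
f, tau = balance_wrench(G, J, w_ext)
```

Solving this optimization offline for many grasps yields supervised targets; the trained network then replaces the solver at run time for real-time prediction.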
|
| |
| 15:30-17:00, Paper MoBIP-10.11 | Add to My Program |
| A Grasp Pose Is All You Need: Learning Multi-Fingered Grasping with Deep Reinforcement Learning from Vision and Touch |
|
| Ceola, Federico | Istituto Italiano Di Tecnologia |
| Maiettini, Elisa | Humanoid Sensing and Perception, Istituto Italiano Di Tecnologia |
| Rosasco, Lorenzo | Istituto Italiano Di Tecnologia & Massachusetts Institute of Technology |
| Natale, Lorenzo | Istituto Italiano Di Tecnologia |
Keywords: Grasping, Reinforcement Learning, Humanoid Robot Systems
Abstract: Multi-fingered robotic hands have the potential to enable robots to perform sophisticated manipulation tasks. However, teaching a robot to grasp objects with an anthropomorphic hand is an arduous problem due to the high dimensionality of the state and action spaces. Deep Reinforcement Learning (DRL) offers techniques to design control policies for this kind of problem without explicit environment or hand modeling. However, state-of-the-art model-free algorithms have proven inefficient for learning such policies. The main problem is that exploration of the environment is unfeasible for such high-dimensional problems, thus hampering the initial phases of policy optimization. One possibility to address this is to rely on off-line task demonstrations, but, oftentimes, this is too demanding in terms of time and computational resources. To address these problems, we propose the A Grasp Pose is All You Need (G-PAYN) method for the anthropomorphic hand of the iCub humanoid. We develop an approach to automatically collect task demonstrations to initialize the training of the policy. The proposed grasping pipeline starts from a grasp pose generated by an external algorithm, used to initiate the movement; a control policy trained with G-PAYN is then used to reach and grasp the object. We deploy the iCub in the MuJoCo simulator and use it to test our approach with objects from the YCB-Video dataset. Results show that G-PAYN outperforms the baseline DRL techniques in the considered setting in terms of success rate and execution time. The code to reproduce the experiments is released together with the paper under an open-source license.
|
| |
| 15:30-17:00, Paper MoBIP-10.12 | Add to My Program |
| Physics-Informed Learning to Enable Robotic Screw-Driving under Hole Pose Uncertainties |
|
| Manyar, Omey Mohan | University of Southern California |
| Varadanahalli Narayan, Santosh | University of Southern California |
| Lengade, Rohin | University of Southern California |
| Gupta, Satyandra K. | University of Southern California |
Keywords: Learning Categories and Concepts, Compliance and Impedance Control, Industrial Robots
Abstract: Screw-driving is an important operation in numerous applications. In many situations, the hole pose cannot be estimated very accurately. Autonomous screw-driving cannot be performed by traditional industrial manipulators in position control mode when the hole pose uncertainty is high. This paper presents a mobile manipulator system for performing autonomous screw-driving in the presence of uncertainties in the hole estimates. It utilizes active compliance in the form of impedance control of the robot and passive compliance in the screw-driving tool to deal with uncertainties. We present a physics-informed machine learning approach to automatically characterize the motion of the screw tip and explain how this motion leads to successful operation in the presence of uncertainty. We also present an approach for detecting failure modes and taking corrective actions. Code and video are available at: https://sites.google.com/usc.edu/physicsinformedscrewdriving
|
| |
| MoBIP-11 Regular session, Hall E |
Add to My Program |
| Aerial Systems - Applications II |
|
| |
| |
| 15:30-17:00, Paper MoBIP-11.1 | Add to My Program |
| Viewpoint-Driven Formation Control of Airships for Cooperative Target Tracking |
|
| Price, Eric | Universität Stuttgart |
| Black, Michael | Max Planck Institute for Intelligent Systems in Tübingen |
| Ahmad, Aamir | University of Stuttgart |
Keywords: Aerial Systems: Perception and Autonomy, Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems
Abstract: For tracking and motion capture (MoCap) of animals in their natural habitat, a formation of safe and silent aerial platforms, such as airships with on-board cameras, is well suited. In our prior work we derived formation properties for optimal MoCap, which include maintaining constant angular separation between observers w.r.t. the subject, a threshold distance to it, and keeping it centered in the camera view. Unlike multi-rotors, airships have non-holonomic constraints and are affected by ambient wind. Their orientation and flight direction are also tightly coupled. Therefore, a control scheme for multicopters that assumes independence of motion direction and orientation is not applicable. In this paper, we address this problem by first exploiting a periodic relationship between the airspeed of an airship and its distance to the subject. We use it to derive analytical and numeric solutions that satisfy the formation properties for optimal MoCap. Based on this, we developed an MPC-based formation controller. We performed a theoretical analysis of our solution and the boundary conditions of its applicability, extensive simulation experiments, and a real-world demonstration of our control method with an unmanned airship. Open-source code (https://tinyurl.com/AsMPCCode) and a video of our method (https://tinyurl.com/AsMPCVid) are provided.
|
| |
| 15:30-17:00, Paper MoBIP-11.2 | Add to My Program |
| ADMNet: Anti-Drone Real-Time Detection and Monitoring |
|
| Zhou, Xunkuai | Tongji University |
| Yang, Guidong | The Chinese University of Hong Kong |
| Chen, Yizhou | Chinese University of Hong Kong |
| Gao, Chuanxiang | The Chinese University of Hong Kong |
| Zhao, Benyun | The Chinese University of Hong Kong |
| Li, Li | Tongji University |
| Chen, Ben M. | Chinese University of Hong Kong |
Keywords: Computer Vision for Automation, Industrial Robots, Object Detection, Segmentation and Categorization
Abstract: We propose a lightweight, effective, and efficient anti-drone network, namely ADMNet, for visually detecting and monitoring unfriendly drones with a constrained field of view, flying against complex environments. We merge an SPP module into the first head of YOLOv4 to improve accuracy and perform network compression to reduce inference latency and model size. To compensate for the accuracy loss caused by compression, we propose an SPPS module and a ResNeck module for the neck of the network and implement an effective attention module for the backbone. Eventually, we present an accurate and compact ADMNet of barely 3.9 MB, ensuring low computational cost and real-time detection. Our method achieves state-of-the-art performance on three challenging real-world datasets (Average Precision @0.5IoU): Det-Fly 96.2%, NPS-Drones 92.0%, and TIBNet 89.7%. In addition to its superior accuracy, its throughput is higher than that of prior work. Comparative testing in real-world scenarios proves that our method exhibits strong reliability and generalization ability. Deploying the network on drone onboard edge-computing devices enables real-time detection and monitoring of flying drones, highlighting the portability and viability of ADMNet.
|
| |
| 15:30-17:00, Paper MoBIP-11.3 | Add to My Program |
| Multi-View Stereo with Learnable Cost Metric |
|
| Yang, Guidong | The Chinese University of Hong Kong |
| Zhou, Xunkuai | Tongji University |
| Gao, Chuanxiang | The Chinese University of Hong Kong |
| Zhao, Benyun | The Chinese University of Hong Kong |
| Zhang, Jihan | Chinese University of Hong Kong |
| Chen, Yizhou | Chinese University of Hong Kong |
| Chen, Xi | The Chinese University of Hong Kong |
| Chen, Ben M. | Chinese University of Hong Kong |
Keywords: Computer Vision for Automation, Aerial Systems: Applications, Deep Learning Methods
Abstract: In this paper, we present LCM-MVSNet, a novel multi-view stereo (MVS) network with learnable cost metric (LCM) for more accurate and complete depth estimation and dense point cloud reconstruction. To adapt to scene variation and improve the reconstruction quality in non-Lambertian, low-textured scenes, we propose LCM to adaptively aggregate multi-view matching similarity into the 3D cost volume by leveraging sparse point hints. The proposed LCM benefits MVS approaches in four ways: depth estimation enhancement, reconstruction quality improvement, memory footprint reduction, and computational burden alleviation, allowing depth inference for high-resolution images to achieve more accurate and complete reconstruction. Moreover, we improve the depth estimation by enhancing the propagation of shallow features via a bottom-up path, and strengthen the end-to-end supervision by adapting the focal loss to reduce ambiguity caused by sample imbalance. Extensive experiments on two benchmark datasets show that our network achieves state-of-the-art performance on the DTU dataset and exhibits strong generalization ability with competitive performance on the Tanks and Temples benchmark. Furthermore, we deploy our LCM-MVSNet in a real-world application for large-scale 3D reconstruction based on multi-view aerial images collected by a self-developed UAV, demonstrating the robustness and scalability of our method. More detailed results are available in the Appendix.
|
| |
| 15:30-17:00, Paper MoBIP-11.4 | Add to My Program |
| A Comparison between Frame-Based and Event-Based Cameras for Flapping-Wing Robot Perception |
|
| Tapia, Raul | University of Seville |
| Rodriguez-Gomez, Juan Pablo | University of Seville |
| Sánchez Díaz, Juan Antonio | University of Seville |
| Gañán, Francisco Javier | Universidad De Sevilla |
| Gutierrez Rodriguez, Ivan | University of Seville |
| Luna-Santamaria, Javier | University of Seville |
| Martinez-de Dios, J.R. | University of Seville |
| Ollero, Anibal | AICIA. G41099946 |
Keywords: Aerial Systems: Perception and Autonomy
Abstract: Perception systems for ornithopters face severe challenges. The harsh vibrations and abrupt movements caused during flapping are prone to produce motion blur and strong lighting-condition changes. Their strict restrictions on weight, size, and energy consumption also limit the type and number of sensors that can be mounted onboard. Lightweight traditional cameras have become a standard off-the-shelf solution in many flapping-wing designs. However, bioinspired event cameras are a promising solution for ornithopter perception due to their microsecond temporal resolution, high dynamic range, and low power consumption. This paper presents an experimental comparison between a frame-based and an event-based camera. Both technologies are analyzed considering the particular specifications of flapping-wing robots and by experimentally evaluating the performance of well-known vision algorithms on data recorded onboard a flapping-wing robot. Our results suggest that event cameras are the most suitable sensors for ornithopters. Nevertheless, they also evidence the open challenges for event-based vision on board flapping-wing robots.
|
| |
| 15:30-17:00, Paper MoBIP-11.5 | Add to My Program |
| Flexible Multi-DoF Aerial 3D Printing Supported with Automated Optimal Chunking |
|
| Stamatopoulos, Marios-Nektarios | Luleå University of Technology |
| Banerjee, Avijit | Luleå University of Technology |
| Nikolakopoulos, George | Luleå University of Technology |
Keywords: Robotics and Automation in Construction, Additive Manufacturing
Abstract: The future of 3D printing utilizing unmanned aerial vehicles (UAVs) presents a promising capability to revolutionize manufacturing and to enable the creation of large-scale structures in remote and hard-to-reach areas, e.g., in other planetary systems. Nevertheless, the limited payload capacity of UAVs and the complexity of 3D printing large objects pose significant challenges. In this article, we propose a novel chunk-based framework for distributed 3D printing using UAVs that sets the basis for fully collaborative aerial 3D printing of challenging structures. Through a novel optimisation process, the presented framework divides the 3D model to be printed into small, manageable chunks and assigns each of them to a UAV for partial printing, in a fully autonomous approach. Thus, we establish the algorithms for chunk division, allocation, and printing, and we also introduce a novel algorithm that efficiently partitions the mesh into planar chunks while accounting for the inter-connectivity constraints of the chunks. The efficiency of the proposed framework is demonstrated through multiple physics-based simulations in Gazebo, where a CAD construction mesh is printed via multiple UAVs carrying materials whose volume is proportionate to a fraction of the total mesh volume.
|
| |
| 15:30-17:00, Paper MoBIP-11.6 | Add to My Program |
| Memory Maps for Video Object Detection and Tracking on UAVs |
|
| Kiefer, Benjamin | University of Tuebingen |
| Quan, Yitong | University of Tuebingen |
| Zell, Andreas | University of Tübingen |
Keywords: Aerial Systems: Perception and Autonomy, Data Sets for Robotic Vision, Object Detection, Segmentation and Categorization
Abstract: This paper introduces a novel approach to video object detection and tracking on Unmanned Aerial Vehicles (UAVs). By incorporating metadata, the proposed approach creates a memory map of object locations in actual world coordinates, providing a more robust and interpretable representation of object locations in both image space and the real world. We use this representation to boost confidences, resulting in improved performance on several temporal computer vision tasks, such as video object detection, short- and long-term single- and multi-object tracking, and video anomaly detection. These findings confirm the benefits of metadata in enhancing the capabilities of UAVs in the field of temporal computer vision and pave the way for further advancements in this area.
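A world-coordinate memory map with confidence boosting, as described in this abstract, can be sketched in a few lines. The grid size, decay factor, and boost weight below are illustrative assumptions, not the paper's settings.

```python
from collections import defaultdict

CELL = 1.0      # grid cell size in metres (assumed)
DECAY = 0.9     # per-frame decay of remembered evidence (assumed)
BOOST = 0.2     # maximum confidence boost from memory (assumed)

memory = defaultdict(float)   # world-grid cell -> remembered evidence

def update_and_boost(detections):
    """detections: list of (x_world, y_world, confidence); returns boosted list."""
    for cell in list(memory):
        memory[cell] *= DECAY                 # old evidence fades over time
    boosted = []
    for x, y, conf in detections:
        cell = (int(x // CELL), int(y // CELL))
        boosted.append((x, y, min(1.0, conf + BOOST * memory[cell])))
        memory[cell] = max(memory[cell], conf)  # remember strongest evidence
    return boosted

update_and_boost([(3.2, 7.9, 0.9)])        # first sighting fills the memory map
out = update_and_boost([(3.4, 7.7, 0.5)])  # a later weak detection in the same cell
```

Because the map lives in world coordinates (recovered from UAV metadata), a weak detection at a location where an object was recently seen receives a higher final confidence than one at an empty location.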
|
| |
| 15:30-17:00, Paper MoBIP-11.7 | Add to My Program |
| Robust Localization of Aerial Vehicles Via Active Control of Identical Ground Vehicles |
|
| Spasojevic, Igor | University of Pennsylvania |
| Liu, Xu | University of Pennsylvania |
| Prabhu, Ankit | University of Pennsylvania |
| Ribeiro, Alejandro | University of Pennsylvania |
| Pappas, George J. | University of Pennsylvania |
| Kumar, Vijay | University of Pennsylvania |
Keywords: Aerial Systems: Perception and Autonomy, Planning, Scheduling and Coordination, Localization
Abstract: This paper addresses the problem of active collaborative localization in heterogeneous robot teams with unknown data association. It involves positioning a small number of identical unmanned ground vehicles (UGVs) at desired positions so that an unmanned aerial vehicle (UAV) can, through unlabelled measurements of the UGVs, uniquely determine its global pose. We model the problem as a sequential two-player game, in which the first player positions the UGVs and the second identifies the two distinct hypothetical poses of the UAV at which the sets of measurements to the UGVs differ by as little as possible. We solve the underlying problem from the vantage point of the first player for a subclass of measurement models using a mixture of local optimization and exhaustive search procedures. Real-world experiments with a team of a UAV and UGVs show that our method can achieve centimeter-level global localization accuracy. We also show that our method consistently outperforms random positioning of UGVs by a large margin, with as much as a 90% reduction in position and angular estimation error. Our method can tolerate a significant amount of random as well as non-stochastic measurement noise. This indicates its potential for reliable state estimation on board size, weight, and power (SWaP) constrained UAVs. This work enables robust localization in perceptually-challenged GPS-denied environments, thus paving the way for large-scale multi-robot navigation and mapping.
|
| |
| 15:30-17:00, Paper MoBIP-11.8 | Add to My Program |
| Semantically-Enhanced Deep Collision Prediction for Autonomous Navigation Using Aerial Robots |
|
| Kulkarni, Mihir | NTNU: Norwegian University of Science and Technology |
| Nguyen, Huan | NTNU - Norwegian University of Science and Technology |
| Alexis, Kostas | NTNU - Norwegian University of Science and Technology |
Keywords: Aerial Systems: Perception and Autonomy
Abstract: This paper contributes a novel and modularized learning-based method for aerial robots navigating cluttered environments containing hard-to-perceive thin obstacles, without assuming access to a map or the full pose estimation of the robot. The proposed solution builds upon a semantically-enhanced Variational Autoencoder that is trained with both real-world and simulated depth images to compress the input data, while preserving semantically-labeled thin obstacles and handling invalid pixels in the depth sensor's output. This compressed representation, together with the robot's partial state (its linear/angular velocities and attitude), is then utilized to train an uncertainty-aware 3D Collision Prediction Network in simulation to predict collision scores for candidate action sequences in a predefined motion primitives library. A set of simulation and experimental studies in cluttered environments with various sizes and types of obstacles, including multiple hard-to-perceive thin objects, were conducted to evaluate the performance of the proposed method and compare against an end-to-end trained baseline. The results demonstrate the benefits of the proposed semantically-enhanced deep collision prediction for learning-based autonomous navigation.
|
| |
| 15:30-17:00, Paper MoBIP-11.9 | Add to My Program |
| Demonstrating Autonomous 3D Path Planning on a Novel Scalable UGV-UAV Morphing Robot |
|
| Sihite, Eric | California Institute of Technology |
| Slezak, Filip | Caltech |
| Mandralis, Ioannis | Caltech |
| Salagame, Adarsh | Northeastern University |
| Ramezani, Milad | CSIRO |
| Kalantari, Arash | NASA JPL |
| Ramezani, Alireza | Northeastern University |
| Morteza, Gharib | CALTECH |
Keywords: Wheeled Robots, Aerial Systems: Applications, Motion and Path Planning
Abstract: Some animals exhibit multi-modal locomotion capability to traverse a wide range of terrains and environments, such as amphibians that can swim and walk or birds that can fly and walk. This capability is extremely beneficial for expanding the animal's habitat range, as they can choose the most energy-efficient mode of locomotion in a given environment. The robotic biomimicry of this multi-modal locomotion capability can be very challenging but offers the same advantages. However, the expanded range of locomotion also increases the complexity of performing localization and path planning. In this work, we present our morphing multi-modal robot, which is capable of ground and aerial locomotion, and the implementation of readily available SLAM and path planning solutions to navigate a complex indoor environment.
|
| |
| 15:30-17:00, Paper MoBIP-11.10 | Add to My Program |
| Topology-Guided Perception-Aware Receding Horizon Trajectory Generation for UAVs |
|
| Sun, Gang | Dalian University of Technology |
| Zhang, Xuetao | Dalian University of Technology |
| Liu, Yisha | Dalian Maritime University |
| Wang, Hanzhang | Dalian University of Technology |
| Zhang, Xuebo | Nankai University |
| Zhuang, Yan | Dalian University of Technology |
Keywords: Motion and Path Planning, Aerial Systems: Applications, Autonomous Vehicle Navigation
Abstract: Perception-aware motion planning based on localization uncertainty has the potential to improve the localization accuracy for robot navigation. However, most of the existing perception-aware methods pre-build a global feature map and cannot generate the perception-aware trajectory in real time. This paper proposes a topology-guided perception-aware receding horizon trajectory generation method, which contains a topology-guided position trajectory generation and a perception-aware yaw angle trajectory generation. Specifically, a memorable active map is built by selectively storing the visual landmarks. After that, a library of candidate topological trajectories is generated; the trajectories are then evaluated in terms of perception quality based on the active map, smoothness, collision possibility, and feasibility. In addition, the yaw angle trajectory is obtained through a front-end multiple refined path search and a back-end path-guided trajectory optimization. Comparative simulation and real-world experiments are carried out to confirm that the proposed method can keep more visual features in the view and reduce the localization error.
|
| |
| 15:30-17:00, Paper MoBIP-11.11 | Add to My Program |
| Learned Inertial Odometry for Autonomous Drone Racing |
|
| Cioffi, Giovanni | University of Zurich |
| Bauersfeld, Leonard | University of Zurich (UZH) |
| Kaufmann, Elia | University of Zurich |
| Scaramuzza, Davide | University of Zurich |
Keywords: Aerial Systems: Perception and Autonomy, Aerial Systems: Applications, Deep Learning Methods
Abstract: Inertial odometry is an attractive solution to the problem of state estimation for agile quadrotor flight. It is inexpensive, lightweight, and it is not affected by perceptual degradation. However, only relying on the integration of the inertial measurements for state estimation is infeasible. The errors and time-varying biases present in such measurements cause the accumulation of large drift in the pose estimates. Recently, inertial odometry has made significant progress in estimating the motion of pedestrians. State-of-the-art algorithms rely on learning a motion prior that is typical of humans but cannot be transferred to drones. In this work, we propose a learning-based odometry algorithm that uses an inertial measurement unit (IMU) as the only sensor modality for autonomous drone racing tasks. The core idea of our system is to couple a model-based filter, driven by the inertial measurements, with a learning-based module that has access to the thrust measurements. We show that our inertial odometry algorithm is superior to the state-of-the-art filter-based and optimization-based visual-inertial odometry as well as the state-of-the-art learned-inertial odometry in estimating the pose of an autonomous racing drone. Additionally, we show that our system is comparable to a visual-inertial odometry solution that uses a camera and exploits the known gate location and appearance. We believe that the application in autonomous drone racing paves the way for novel research in inertial odometry for agile quadrotor flight.
|
| |
| 15:30-17:00, Paper MoBIP-11.12 | Add to My Program |
| Nonlinear Deterministic Observer for Inertial Navigation Using Ultra-Wideband and IMU Sensor Fusion |
|
| Hashim, Hashim A. | Carleton University |
| E. E. Eltoukhy, Abdelrahman | The Hong Kong Polytechnic University |
| Vamvoudakis, Kyriakos G. | Georgia Inst. of Tech |
| Abouheaf, Mohammed | University of Ottawa |
Keywords: Aerial Systems: Perception and Autonomy, SLAM, Optimization and Optimal Control
Abstract: Navigation in Global Positioning System (GPS)-denied environments requires robust estimators that fuse inertial sensor measurements to estimate a rigid body's orientation, position, and linear velocity. Ultra-wideband (UWB) and Inertial Measurement Unit (IMU) sensors represent low-cost measurement technology that can be utilized for successful inertial navigation. This paper presents a nonlinear deterministic navigation observer in continuous form that directly employs UWB and IMU measurements. The estimator is developed on the extended Special Euclidean Group SE_2(3) and ensures exponential convergence of the closed-loop error signals starting from almost any initial condition. The discrete version of the proposed observer is tested using a publicly available real-world dataset of a drone flight.
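For reference, an element of the extended Special Euclidean group used by this observer collects the attitude R, position p, and linear velocity v in a single matrix. This is the standard textbook definition of SE_2(3), not notation specific to this paper:

```latex
\mathbb{SE}_2(3) = \left\{ X = \begin{bmatrix} R & p & v \\ 0_{1\times 3} & 1 & 0 \\ 0_{1\times 3} & 0 & 1 \end{bmatrix} \;\middle|\; R \in \mathbb{SO}(3),\; p, v \in \mathbb{R}^3 \right\}
```

Working on this group lets one observer jointly estimate orientation, position, and velocity with a single matrix state.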
|
| |
| 15:30-17:00, Paper MoBIP-11.13 | Add to My Program |
| Precision Post-Stall Landing Using NMPC with Learned Aerodynamics |
|
| Basescu, Max | Johns Hopkins University Applied Physics Lab |
| Yeh, Bryanna | The Johns Hopkins University Applied Physics Laboratory |
| Scheuer, Luca | Johns Hopkins University Applied Physics Lab |
| Wolfe, Kevin | Johns Hopkins University Applied Physics Laboratory |
| Moore, Joseph | Johns Hopkins University Applied Physics Lab |
Keywords: Aerial Systems: Perception and Autonomy, Field Robots, Aerial Systems: Applications
Abstract: In this paper, we present an approach for achieving precision post-stall landings with medium-sized Group 1 Unmanned Aerial Systems (UAS). To do this, we employ an aggressive dive-and-stall maneuver to significantly reduce landing distance, time, and touchdown speed. Our ultimate approach relies on a nonlinear model predictive control (NMPC) algorithm and learned aerodynamic coefficients to achieve accuracy and reliability in the presence of wind disturbances. We demonstrate our approach in hardware with a 60-inch wingspan, 4.2 kg fixed-wing UAS, and show the ability to land with low speed and high accuracy using minimal throttle.
|
| |
| 15:30-17:00, Paper MoBIP-11.14 | Add to My Program |
| Cascaded Denoising Transformer for UAV Nighttime Tracking |
|
| Lu, Kunhan | Tongji University |
| Fu, Changhong | Tongji University |
| Wang, Yucheng | Tongji University |
| Zuo, Haobo | Tongji University |
| Zheng, Guangze | Tongji University |
| Pan, Jia | University of Hong Kong |
Keywords: Aerial Systems: Perception and Autonomy, Aerial Systems: Applications, Deep Learning for Visual Perception
Abstract: The automation of unmanned aerial vehicles (UAVs) has been greatly promoted by visual object tracking methods with onboard cameras. However, the random and complicated noise produced by the cameras seriously hinders the performance of state-of-the-art (SOTA) UAV trackers, especially in low-illumination environments. To address this issue, this work proposes an efficient plug-and-play cascaded denoising Transformer (CDT) to suppress cluttered and complex noise, thereby boosting UAV tracking performance. Specifically, the novel U-shaped cascaded denoising network is designed with a streamlined structure for efficient computation. Additionally, a shallow feature deepening (SFD) encoder and a multi-feature collaboration (MFC) decoder are constructed based on multi-head transposed self-attention (MTSA) and multi-head transposed cross-attention (MTCA), respectively. A nested residual feed-forward network (NRFN) is developed to focus more on high-frequency information represented by noise. Extensive evaluation and test experiments demonstrate that the proposed CDT has a remarkable denoising effect and improves UAV nighttime tracking performance. The source code, pre-trained models, and experimental results are available at https://github.com/vision4robotics/CDT.
|
| |
| MoBIP-12 Regular session, Hall E |
Add to My Program |
| Clone of 'Perception for Grasping and Manipulation II' |
|
| |
| |
| 15:30-17:00, Paper MoBIP-12.1 | Add to My Program |
| Model-Free Grasping with Multi-Suction Cup Grippers for Robotic Bin Picking |
|
| Schillinger, Philipp | Bosch Center for Artificial Intelligence |
| Gabriel, Miroslav | Bosch Center for Artificial Intelligence |
| Kuss, Alexander | Robert Bosch GmbH, Corporate Sector Research and Advance Enginee |
| Ziesche, Hanna | Bosch BCAI |
| Anh Vien, Ngo | Bosch GmbH |
Keywords: Perception for Grasping and Manipulation, Computer Vision for Automation, Industrial Robots
Abstract: This paper presents a novel method for model-free prediction of grasp poses for suction grippers with multiple suction cups. Our approach is agnostic to the design of the gripper and does not require gripper-specific training data. In particular, we propose a two-step approach: first, a neural network predicts pixel-wise grasp quality for an input image to indicate areas that are generally graspable; second, an optimization step determines the optimal gripper selection and corresponding grasp poses based on configured gripper layouts and activation schemes. In addition, we introduce a method of automated labeling for the supervised training of the grasp quality network. Experimental evaluations on a real-world industrial application with bin picking scenes of varying difficulty demonstrate the effectiveness of our method.
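The two-step structure described in the abstract — pixel-wise quality first, gripper-layout optimization second — can be sketched as follows; the function names, the brute-force search, and the toy inputs are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def best_grasp(quality, cup_mask):
    """Slide a binary suction-cup layout mask over a pixel-wise grasp
    quality map and score each placement by the total quality under the
    active cups; return the best placement and its score."""
    qh, qw = quality.shape
    mh, mw = cup_mask.shape
    best_score, best_pos = -np.inf, None
    for r in range(qh - mh + 1):
        for c in range(qw - mw + 1):
            s = np.sum(quality[r:r + mh, c:c + mw] * cup_mask)
            if s > best_score:
                best_score, best_pos = s, (r, c)
    return best_pos, best_score

# Toy quality map: a 3x3 graspable patch inside a 5x5 image.
quality = np.zeros((5, 5))
quality[1:4, 1:4] = 1.0
cup_mask = np.array([[1, 0, 1]])  # two active cups, two pixels apart
pos, score = best_grasp(quality, cup_mask)
```

In the paper the second step also chooses among multiple gripper layouts and activation schemes; that would amount to running this search once per candidate layout mask and keeping the overall best.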
|
| |
| 15:30-17:00, Paper MoBIP-12.2 | Add to My Program |
| Vision-Based State and Pose Estimation for Robotic Bin Picking of Cables |
|
| Monguzzi, Andrea | Politecnico Di Milano |
| Cella, Christian | Politecnico Di Milano |
| Zanchettin, Andrea Maria | Politecnico Di Milano |
| Rocco, Paolo | Politecnico Di Milano |
Keywords: Perception for Grasping and Manipulation, Dual Arm Manipulation, Industrial Robots
Abstract: This paper deals with the challenging task of picking semi-deformable linear objects (SDLOs) from a bin. SDLOs are deformable elements, such as cables, joined to a rigid part such as a connector. We propose a vision-based strategy to detect, classify, and estimate the pose and the state (free or occluded) of connectors belonging to an unspecified number of SDLOs arranged in an unknown configuration in the bin. The connectors can then be grasped and manipulated by a dual-arm robot through a set of manipulation primitives. In this way, a single SDLO can be extracted from the bin and laid on the worktable. A subsequent association between the connectors and the extracted SDLOs is performed, allowing the robot to firmly grasp an SDLO at its ends for further manipulation. The procedure is tested in bin picking operations with several kinds of SDLOs and is applied to a use case involving a collaborative wire harness assembly task.
|
| |
| 15:30-17:00, Paper MoBIP-12.3 | Add to My Program |
| Efficient Visuo-Haptic Object Shape Completion for Robot Manipulation |
|
| Rustler, Lukas | Ceske Vysoke Uceni Technicke V Praze, FEL |
| Matas, Jiri | Czech Technical University |
| Hoffmann, Matej | Czech Technical University in Prague, Faculty of Electrical Engi |
Keywords: Perception for Grasping and Manipulation, Force and Tactile Sensing, RGB-D Perception
Abstract: For robot manipulation, a complete and accurate object shape is desirable. Here, we present a method that combines visual and haptic reconstruction in a closed-loop pipeline. From an initial viewpoint, the object shape is reconstructed using an implicit surface deep neural network. The location with the highest uncertainty is selected for haptic exploration, the object is touched, the new information from touch and a new point cloud from the camera are added, the object position is re-estimated, and the cycle is repeated. We extend Rustler et al. (2022) by using a new theoretically grounded method to determine the points with the highest uncertainty, and we increase the yield of every haptic exploration by adding not only the contact points to the point cloud but also the empty space established through the robot's movement toward the object. Additionally, the solution is compact in that the jaws of a closed two-finger gripper are directly used for exploration. The object position is re-estimated after every robot action, and multiple objects can be present simultaneously on the table. We achieve a steady improvement with every touch using three different metrics and demonstrate the utility of the better shape reconstruction in grasping experiments on the real robot. On average, the grasp success rate increases from 63.3% to 70.4% after a single exploratory touch and to 82.7% after five touches. The collected data are publicly available at https://osf.io/j6rkd/ and code at https://github.com/ctu-vras/vishac.
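The closed-loop pipeline in the abstract can be sketched in a few lines; every callable below is a hypothetical stand-in (the real system uses an implicit-surface network for uncertainty and a physical gripper for touching):

```python
import numpy as np

def visuo_haptic_loop(cloud, predict_uncertainty, touch, n_touches=5):
    """Repeatedly touch the reconstruction point with the highest predicted
    uncertainty, then fold the contact point (and any free space the gripper
    swept through) back into the evidence used for the next reconstruction."""
    cloud = [np.asarray(p, dtype=float) for p in cloud]
    for _ in range(n_touches):
        scores = predict_uncertainty(np.stack(cloud))  # one score per point
        target = cloud[int(np.argmax(scores))]
        contact, swept_free_space = touch(target)      # exploratory touch
        cloud.append(np.asarray(contact, dtype=float))
        cloud.extend(np.asarray(q, dtype=float) for q in swept_free_space)
    return np.stack(cloud)

# Toy run: a dummy model that is most uncertain far from the origin.
out = visuo_haptic_loop(
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    predict_uncertainty=lambda c: np.linalg.norm(c, axis=1),
    touch=lambda t: (t + 0.1, []),
)
```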
|
| |
| 15:30-17:00, Paper MoBIP-12.4 | Add to My Program |
| Force Map: Learning to Predict Contact Force Distribution from Vision |
|
| Hanai, Ryo | National Institute of Industrial Science and Technology(AIST) |
| Domae, Yukiyasu | The National Institute of Advanced Industrial Science and Techno |
| Ramirez-Alpizar, Ixchel Georgina | National Institute of Advanced Industrial Science and Technology |
| Leme, Bruno | University of Florida |
| Ogata, Tetsuya | Waseda University |
Keywords: Perception for Grasping and Manipulation, Visual Learning, Force and Tactile Sensing
Abstract: When humans see a scene, they can roughly imagine the forces applied to objects based on their experience and use them to handle the objects properly. This paper considers transferring this "force-visualization" ability to robots. We hypothesize that a rough force distribution (named "force map") can be utilized for object manipulation strategies even if accurate force estimation is impossible. Based on this hypothesis, we propose a training method to predict the force map from vision. To investigate this hypothesis, we generated scenes where objects were stacked in bulk through simulation and trained a model to predict the contact force from a single image. We further applied domain randomization to make the trained model function on real images. The experimental results showed that the model trained using only synthetic images could predict approximate patterns representing the contact areas of the objects even for real images. Then, we designed a simple algorithm to plan a lifting direction using the predicted force distribution. We confirmed that using the predicted force distribution contributes to finding natural lifting directions for typical real-world scenes. Furthermore, the evaluation through simulations showed that the disturbance caused to surrounding objects was reduced by 26% (translation displacement) and by 39% (angular displacement) for scenes where objects were overlapping.
|
| |
| 15:30-17:00, Paper MoBIP-12.5 | Add to My Program |
| Push to Know! - Visuo-Tactile Based Active Object Parameter Inference with Dual Differentiable Filtering |
|
| Dutta, Anirvan | BMW Group and Imperial College London |
| Burdet, Etienne | Imperial College London |
| Kaboli, Mohsen | BMW Group |
Keywords: Perception for Grasping and Manipulation, Force and Tactile Sensing
Abstract: For robotic systems to interact with objects in dynamic environments, it is essential to perceive the physical properties of the objects, such as shape, friction coefficient, mass, center of mass, and inertia. This not only eases selecting manipulation actions but also ensures the task is performed as desired. However, estimating the physical properties of novel objects in particular is a challenging problem, using either vision or tactile sensing. In this work, we propose a novel framework to estimate key object parameters through non-prehensile manipulation with vision and tactile sensing. Our proposed active dual differentiable filtering (ADDF) approach, as part of our framework, learns the object-robot interaction during a non-prehensile object push to infer the object's parameters. Our method enables the robotic system to employ vision and tactile information to interactively explore a novel object via non-prehensile pushing. The novel N-step active formulation within the differentiable filtering facilitates efficient learning of the object-robot interaction model and, during inference, selection of the next best exploratory push actions (where to push, and how). We extensively evaluated our framework in simulated and real-robot scenarios, yielding superior performance to the state-of-the-art baseline.
|
| |
| 15:30-17:00, Paper MoBIP-12.6 | Add to My Program |
| IOSG: Image-Driven Object Searching and Grasping |
|
| Yu, Houjian | University of Minnesota, Twin Cities |
| Lou, Xibai | University of Minnesota Twin Cities |
| Yang, Yang | University of Minnesota |
| Choi, Changhyun | University of Minnesota, Twin Cities |
Keywords: Perception-Action Coupling, Perception for Grasping and Manipulation, Deep Learning in Grasping and Manipulation
Abstract: When robots retrieve specific objects from cluttered scenes, such as home and warehouse environments, the target objects are often partially occluded or completely hidden. Robots are thus required to search for, identify, and successfully grasp a target object. Preceding works have relied on pre-trained object recognition or segmentation models to find the target object. However, such methods require laborious manual annotations to train the models and can even fail to find novel target objects. In this paper, we propose an Image-driven Object Searching and Grasping (IOSG) approach where a robot is provided with the reference image of a novel target object and tasked to find and retrieve it. We design a Target Similarity Network that generates a probability map to infer the location of the novel target. IOSG learns a hierarchical policy; the high-level policy predicts the subtask type, whereas the low-level policies, explorer and coordinator, generate effective push and grasp actions. The explorer is responsible for searching for the target object when it is hidden or occluded by other objects. Once the target object is found, the coordinator conducts target-oriented pushing and grasping to retrieve the target from the clutter. The proposed pipeline is trained with full self-supervision in simulation and applied to a real environment. Our model achieves task success rates of 96.0% and 94.5% on coordination and exploration tasks in simulation, respectively, and an 85.0% success rate on a real robot for the search-and-grasp task. Please refer to our project page for more information: https://z.umn.edu/iosg.
|
| |
| 15:30-17:00, Paper MoBIP-12.7 | Add to My Program |
| DexRepNet: Learning Dexterous Robotic Grasping Network with Geometric and Spatial Hand-Object Representation |
|
| Qingtao, Liu | Zhejiang University |
| Cui, Yu | Zhejiang University |
| Ye, Qi | Zhejiang University |
| Sun, Zhengnan | Zhejiang University |
| Li, Haoming | Zhejiang University |
| Li, Gaofeng | Zhejiang University |
| Shao, Lin | National University of Singapore |
| Chen, Jiming | Zhejiang University |
Keywords: Perception for Grasping and Manipulation, Grasping, Multifingered Hands
Abstract: Robotic dexterous grasping is a challenging problem due to the high degrees of freedom (DoF) and complex contacts of multi-fingered robotic hands. Existing deep reinforcement learning (DRL) based methods leverage human demonstrations to reduce the sample complexity caused by the high-dimensional action space of dexterous grasping. However, less attention has been paid to hand-object interaction representations for high-level generalization. In this paper, we propose a novel geometric and spatial hand-object interaction representation, named DexRep, to capture object surface features and the spatial relations between hands and objects during grasping. DexRep comprises an Occupancy Feature for rough shapes within the sensing range of the moving hand, a Surface Feature for changing hand-object surface distances, and a Local-Geo Feature for the local geometric surface features most relevant to potential contacts. Based on the new representation, we propose a dexterous deep reinforcement learning method, DexRepNet, to learn a generalizable grasping policy. Experimental results show that our method dramatically outperforms baselines using existing representations for robotic grasping, in both grasp success rate and convergence speed. It achieves a 93% grasping success rate on seen objects and grasping success rates above 80% on diverse objects of unseen categories, in both simulation and real-world experiments.
|
| |
| 15:30-17:00, Paper MoBIP-12.8 | Add to My Program |
| Active Acoustic Sensing for Robot Manipulation |
|
| Lu, Shihan | University of Southern California |
| Culbertson, Heather | University of Southern California |
Keywords: Perception for Grasping and Manipulation, Force and Tactile Sensing, Grasping
Abstract: Perception in robot manipulation has been actively explored with the goal of advancing and integrating vision and touch for global and local feature extraction. However, it is difficult to perceive certain object internal states, and the integration of visual and haptic perception is not compact and is easily biased. We propose to address these limitations by developing an active acoustic sensing method for robot manipulation. Active acoustic sensing relies on the resonant properties of the object, which are related to its material, shape, internal structure, and contact interactions with the gripper and environment. The sensor consists of a vibration actuator paired with a piezoelectric microphone. The actuator generates a waveform, and the microphone tracks the waveform's propagation and distortion as it travels through the object. This paper presents the sensing principles, hardware design, simulation development, and evaluation of physical and simulated sensory data under different conditions as a proof of concept. This work aims to provide the fundamentals of a useful tool for downstream robot manipulation tasks using active acoustic sensing, such as object recognition, grasping point estimation, object pose estimation, and external contact formation detection.
|
| |
| 15:30-17:00, Paper MoBIP-12.9 | Add to My Program |
| Grasp Region Exploration for 7-DoF Robotic Grasping in Cluttered Scenes |
|
| Chen, Zibo | Sun Yat-Sen University |
| Liu, Zhixuan | Sun Yat-Sen University |
| Xie, Shangjin | Sun Yat-Sen University |
| Zheng, Wei-Shi | Sun Yat-Sen University |
Keywords: Perception for Grasping and Manipulation
Abstract: Robotic grasping is a fundamental skill for robots, but it is quite challenging in cluttered scenes. In cluttered scenes, precise prediction of high-quality grasp configurations, such as rotation and grasping width, while avoiding collisions is essential. To accomplish this, grasp detection models must extract strong fine-grained information around the grasp points. However, due to computational resource restrictions, point clouds are usually downsampled in existing networks, which inevitably causes some potentially important points to be discarded. To overcome this problem, we propose a Grasp Region Exploration module to explore the area covered by high-quality grasps. Based on the grasp region, we enhance the point density around the grasp points to mitigate the loss of information caused by downsampling. Furthermore, we devise the Grasp Region Attention module to dynamically aggregate features of various points within the grasp region, such as the grasp point and contact points. The proposed method achieves state-of-the-art performance on the large-scale GraspNet-1Billion dataset. We also conduct real-world experiments on a Franka Emika Panda robot and show that the robot can grasp objects in cluttered scenes with a high success rate.
|
| |
| 15:30-17:00, Paper MoBIP-12.10 | Add to My Program |
| Bagging by Learning to Singulate Layers Using Interactive Perception |
|
| Chen, Lawrence Yunliang | UC Berkeley |
| Shi, Baiyu | UC Berkeley |
| Lin, Roy | University of California, Berkeley |
| Seita, Daniel | Carnegie Mellon University |
| Ahmad, Ayah | University of California, Berkeley |
| Cheng, Richard | California Institute of Technology |
| Kollar, Thomas | Toyota Research Institute |
| Held, David | Carnegie Mellon University |
| Goldberg, Ken | UC Berkeley |
Keywords: Perception for Grasping and Manipulation, Bimanual Manipulation, Deep Learning in Grasping and Manipulation
Abstract: Many fabric handling and 2D deformable material tasks in homes and industry require singulating layers of material such as opening a bag or arranging garments for sewing. In contrast to methods requiring specialized sensing or end effectors, we use only visual observations with ordinary parallel jaw grippers. We propose SLIP: Singulating Layers using Interactive Perception, and apply SLIP to the task of autonomous bagging. We develop SLIP-Bagging, a bagging algorithm that manipulates a plastic or fabric bag from an unstructured state, and uses SLIP to grasp the top layer of the bag to open it for object insertion. In physical experiments, a YuMi robot achieves a success rate of 67% to 81% across bags of a variety of materials, shapes, and sizes, significantly improving in success rate and generality over prior work. Experiments also suggest that SLIP can be applied to tasks such as singulating layers of folded cloth and garments. Supplementary material is available at https://sites.google.com/view/slip-bagging/.
|
| |
| 15:30-17:00, Paper MoBIP-12.11 | Add to My Program |
| Simultaneous Multi-Object 3D Shape Reconstruction, 6DoF Pose Estimation and Dense Grasp Prediction |
|
| Agrawal, Shubham | Samsung Research America |
| Chavan-Dafle, Nikhil | Samsung Research America |
| Kasahara, Isaac | Samsung Research America |
| Engin, Kazim Selim | University of Minnesota |
| Huh, Jinwook | Samsung |
| Isler, Volkan | University of Minnesota |
Keywords: Perception for Grasping and Manipulation, Deep Learning in Grasping and Manipulation, Grasping
Abstract: In this paper, we present a real-time method for simultaneous object-level scene understanding and grasp prediction. Specifically, given a single RGBD image of a scene, our method localizes all the objects in the scene and, for each object, it generates the following: full 3D shape, scale, pose with respect to the camera frame, and a dense set of feasible grasps. The main advantage of our method is its computation speed, as it avoids sequential perception and grasp planning. With detailed quantitative analysis of reconstruction quality and grasp accuracy, we show that our method delivers competitive performance compared to the state-of-the-art methods, while providing fast inference at 30 frames per second.
|
| |
| 15:30-17:00, Paper MoBIP-12.12 | Add to My Program |
| Flexible Handover with Real-Time Robust Dynamic Grasp Trajectory Generation |
|
| Zhang, Gu | Shanghai Jiaotong University |
| Fang, Hao-Shu | Shanghai Jiao Tong University |
| Fang, Hongjie | Shanghai Jiao Tong University |
| Lu, Cewu | ShangHai Jiao Tong University |
Keywords: Perception for Grasping and Manipulation, Human-Robot Collaboration, Grasping
Abstract: In recent years, there has been a significant effort dedicated to developing efficient, robust, and general human-to-robot handover systems. However, flexible handover in the context of complex and continuous object motion remains relatively unexplored. In this work, we propose an approach for effective and robust flexible handover, which enables the robot to grasp objects moving along flexible trajectories with a high success rate. The key innovation of our approach is the generation of real-time robust grasp trajectories. We also design a future grasp prediction algorithm to enhance the system's adaptability to dynamic handover scenes. We conduct one-motion handover experiments and motion-continuous handover experiments on our novel benchmark that includes 31 diverse household objects. The system we have developed allows users to move and rotate objects in their hands within a relatively large range. The success rate of the robot grasping such moving objects is 78.15% over the entire household object benchmark.
|
| |
| MoBIP-13 Regular session, Hall E |
Add to My Program |
| Clone of 'Computer Vision for Automation' |
|
| |
| |
| 15:30-17:00, Paper MoBIP-13.1 | Add to My Program |
| NeurAR: Neural Uncertainty for Autonomous 3D Reconstruction with Implicit Neural Representations |
|
| Ran, Yunlong | Zhejiang University |
| Zeng, Jing | Zhejiang University |
| He, Shibo | Zhejiang University |
| Chen, Jiming | Zhejiang University |
| Li, Lincheng | NetEase Fuxi AI Lab |
| Chen, Yingfeng | Netease Inc |
| Lee, Gim Hee | National University of Singapore |
| Ye, Qi | Zhejiang University |
Keywords: Computer Vision for Automation, Motion and Path Planning, Planning under Uncertainty
Abstract: Implicit neural representations have shown compelling results in offline 3D reconstruction and have also recently demonstrated potential for online SLAM systems. However, applying them to autonomous 3D reconstruction, where a robot is required to explore a scene and plan a view path for the reconstruction, has not been studied. In this paper, we explore for the first time the possibility of using implicit neural representations for autonomous 3D scene reconstruction by addressing two key challenges: 1) seeking a criterion to measure the quality of candidate viewpoints for view planning based on the new representations, and 2) learning the criterion from data so that it generalizes to different scenes, instead of hand-crafting one. To solve these challenges, firstly, a proxy of Peak Signal-to-Noise Ratio (PSNR) is proposed to quantify viewpoint quality; secondly, the proxy is optimized jointly with the parameters of an implicit neural network for the scene. With the proposed view quality criterion from neural networks (termed Neural Uncertainty), we can then apply implicit representations to autonomous 3D reconstruction. Our method demonstrates significant improvements on various metrics for the rendered image quality and the geometry quality of the reconstructed 3D models when compared with variants using TSDF or reconstruction without view planning.
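The PSNR quantity the learned proxy approximates is simple to state; a minimal reference computation is shown below (the toy inputs are illustrative, and this is the ground-truth metric itself, not the paper's learned proxy, which must score a viewpoint without access to a reference image):

```python
import numpy as np

def psnr(rendered, reference, max_val=1.0):
    """Peak Signal-to-Noise Ratio between a rendered view and a reference
    image: 10 * log10(MAX^2 / MSE), higher meaning a better render."""
    mse = np.mean((np.asarray(rendered) - np.asarray(reference)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

# A render that is uniformly off by 0.1 from a black reference image:
# MSE = 0.01, so PSNR = 10 * log10(1 / 0.01) = 20 dB.
value = psnr(np.full((4, 4), 0.1), np.zeros((4, 4)))
```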
|
| |
| 15:30-17:00, Paper MoBIP-13.2 | Add to My Program |
| HyperTraj: Towards Simple and Fast Scene-Compliant Endpoint Conditioned Trajectory Prediction |
|
| Huang, Renhao | University of New South Wales |
| Pagnucco, Maurice | University of New South Wales |
| Song, Yang | University of New South Wales |
Keywords: Computer Vision for Automation, Vision-Based Navigation, Intention Recognition
Abstract: An important task in trajectory prediction is to model the uncertainty of agents' motions, which requires the system to propose multiple plausible future trajectories for agents based on their past movements. Recently, many approaches have been developed following an endpoint-conditioned deep learning framework: first predicting the distribution of endpoints, then sampling endpoints from it, and finally completing their waypoints. However, this framework suffers from a severe efficiency issue, as it needs to repeatedly execute a separate decoder conditioned on each of the sampled endpoints. In this work, we propose a simple and fast endpoint-conditioned fully convolutional trajectory prediction framework, called HyperTraj, which uses dynamic convolutions to generate multiple trajectories. Its main benefits are that (1) our prediction is conditioned on endpoints yet takes almost constant time as the number of goals increases, and (2) our model benefits from convolution-based prediction, such as accepting various scene sizes and better modeling agent-scene interactions. In our experiments, our model shows comparable or even better accuracy than state-of-the-art baselines on the SDD and VIRAT datasets, with around 84% acceleration and 90% model-weight reduction for waypoint decoding.
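Why endpoint-conditioned dynamic convolution can be near constant-time in the number of goals: the expensive shared feature map is computed once, and each sampled endpoint only contributes a cheap, per-endpoint kernel. A sketch of this mechanism with 1x1 kernels is below; the names, shapes, and the simple linear hypernetwork are illustrative assumptions, not HyperTraj's actual architecture:

```python
import numpy as np

def dynamic_conv1x1(features, endpoint_embeddings, W):
    """Turn each endpoint embedding into per-endpoint 1x1 convolution
    weights via a linear projection W, then apply all K kernels to the
    same shared feature map in a single matrix multiply.
    features: (C, H, W_img); endpoint_embeddings: (K, D); W: (D, C)."""
    kernels = endpoint_embeddings @ W            # (K, C): one kernel per goal
    C, H, Wimg = features.shape
    out = kernels @ features.reshape(C, H * Wimg)  # shared features reused K times
    return out.reshape(-1, H, Wimg)              # (K, H, W_img)

# Toy check: identity embeddings and projection just select channels.
features = np.ones((2, 3, 3))                          # C=2 shared feature map
out = dynamic_conv1x1(features, np.eye(2), np.eye(2))  # K=2 endpoints
```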
|
| |
| 15:30-17:00, Paper MoBIP-13.3 | Add to My Program |
| PanelPose: A 6D Pose Estimation of Highly-Variable Panel Object for Robotic Robust Cockpit Panel Inspection |
|
| Sun, Han | Shanghai Jiao Tong University |
| Ni, Peiyuan | National University of Singapore |
| Li, Zhiqi | Shanghai Jiao Tong University |
| Wang, Yizhao | SJTU |
| Zhu, Xiaoxiao | SJTU |
| Cao, Qixin | Shanghai Jiao Tong University |
Keywords: Computer Vision for Automation, Industrial Robots, Recognition
Abstract: In robotic cockpit inspection scenarios, the 6D pose of highly-variable panel objects is necessary. However, the buttons in different states on the panel produce variable texture and point clouds, which confuse traditional pose estimation methods that assume an invariable object; this variability is the bottleneck. To address this issue, we propose a simple yet effective method, denoted PanelPose, that leverages synthetic data and edge-line features. Specifically, we extract edge and line features from RGB images and fuse these feature maps into a multi-feature fusion map (MFF Map) to focus on the shape features of panel objects. Moreover, we design an effective keypoint selection algorithm that considers the shape information of panel objects, which simplifies keypoint localization for precise pose estimation. Finally, the panel object pose is estimated via PnP/RANSAC and refined by the multistate template (MST) and multi-scale ICP. We experimentally show that state-of-the-art 6D pose estimation methods alone are not sufficient to solve the cockpit panel inspection task, but that our method significantly improves performance. In cockpit inspection scenarios, the panel localization error is less than 3 mm using our method. Code and data are available at https://github.com/sunhan1997/PanelPose.
|
| |
| 15:30-17:00, Paper MoBIP-13.4 | Add to My Program |
| Image Restoration Via UAVFormer for Under-Display Camera of UAV |
|
| Zheng, Zhuoran | Nanjing University of Science and Technology |
| Jia, Xiuyi | Nanjing University of Science and Technology |
Keywords: Computer Vision for Automation, Computer Vision for Manufacturing, Computer Vision for Transportation
Abstract: The exposed cameras of UAVs can shake, shift, or even malfunction under the influence of harsh weather, while add-on devices (Dupont lines) are very vulnerable to damage. Although we can place a low-cost transparent film overlay around the camera to protect it, this also introduces image degradation issues (such as oversaturation, astigmatism, etc.). To tackle the image degradation caused by the overlaid transparent film, in this paper we propose a novel method to enhance the visual experience by adapting a deep network to UAV characteristics. Specifically, we first develop a stabilizer to filter the input images, which avoids blurred imaging due to the shaking of the drone hardware. Then, we propose a customized Transformer named UAVFormer to recover the image, which has a key module at each stage based on the Swin Transformer with local awareness (LAT). Finally, we use an evidential fusion algorithm to integrate the images generated at each stage to obtain a high-quality result. Furthermore, we create a high-resolution under-display camera dataset to support the training and testing of compared models. Our model can perform high-quality recovery of 2K-resolution images on some embedded devices (Raspberry Pi 4B) in real time.
|
| |
| 15:30-17:00, Paper MoBIP-13.5 | Add to My Program |
| Semantic Scene Difference Detection in Daily Life Patrolling by Mobile Robots Using Pre-Trained Large-Scale Vision-Language Model |
|
| Obinata, Yoshiki | The University of Tokyo |
| Kawaharazuka, Kento | The University of Tokyo |
| Kanazawa, Naoaki | The University of Tokyo |
| Yamaguchi, Naoya | The University of Tokyo |
| Tsukamoto, Naoto | The University of Tokyo |
| Yanokura, Iori | University of Tokyo |
| Kitagawa, Shingo | The University of Tokyo |
| Shinjo, Koki | The University of Tokyo |
| Okada, Kei | The University of Tokyo |
| Inaba, Masayuki | The University of Tokyo |
Keywords: Environment Monitoring and Management, Computer Vision for Automation, Recognition
Abstract: It is important for daily life support robots to detect changes in their environment and perform tasks accordingly. In the field of anomaly detection in computer vision, probabilistic and deep learning methods have been used to calculate image distances. These methods calculate distances by focusing on image pixels. In contrast, this study aims to detect semantic changes in the daily life environment by drawing on recent developments in large-scale vision-language models. Using a Visual Question Answering (VQA) model, we propose a method to detect semantic changes by applying multiple questions to a reference image and a current image and obtaining answers in the form of sentences. Unlike deep-learning-based anomaly detection methods, this method does not require any training or fine-tuning, is not affected by noise, and is sensitive to semantic state changes in the real world. In our experiments, we demonstrated the effectiveness of this method by applying it to a patrol task in a real-life environment using a mobile robot, the Fetch Mobile Manipulator. In the future, it may be possible to add explanatory power to changes in the daily life environment through spoken language.
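The question-and-compare scheme in the abstract can be sketched as follows; the `vqa` callable is a hypothetical stand-in for the pre-trained vision-language model, not the paper's actual API:

```python
def semantic_changes(vqa, questions, reference_img, current_img):
    """Ask the same questions of the reference and current scene images
    and keep the questions whose answers differ, as candidate semantic
    changes in the environment."""
    changes = []
    for q in questions:
        before, after = vqa(reference_img, q), vqa(current_img, q)
        if before != after:
            changes.append((q, before, after))
    return changes

# Toy run with dictionaries standing in for images and a lookup-table VQA.
diff = semantic_changes(
    lambda img, q: img.get(q, "unknown"),
    ["Is the door open?", "Is the light on?"],
    {"Is the door open?": "no", "Is the light on?": "yes"},
    {"Is the door open?": "yes", "Is the light on?": "yes"},
)
```

Because the comparison operates on natural-language answers rather than pixels, it is insensitive to illumination or viewpoint noise but sensitive to the semantic state changes the questions probe.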
|
| |
| 15:30-17:00, Paper MoBIP-13.6 | Add to My Program |
| Seeing the Fruit for the Leaves: Robotically Mapping Apple Fruitlets in a Commercial Orchard |
|
| Qureshi, Ans | University of Auckland |
| Smith, David | University of Auckland |
| Gee, Trevor | The University of Auckland |
| Nejati, Mahla | The University of Auckland |
| Shahabi, Jalil | University of Auckland |
| Lim, JongYoon | University of Auckland |
| Ahn, Ho Seok | The University of Auckland, Auckland |
| McGuinness, Benjamin John | University of Waikato |
| Downes, Catherine | University of Waikato |
| Jangali, Rahul | The University of Waikato |
| Black, Kale | Black Box Technologies LTD |
| Lim, Shen Hin | University of Waikato |
| Duke, Mike | Waikato University |
| MacDonald, Bruce | University of Auckland |
| Williams, Henry | University of Auckland |
Keywords: Robotics and Automation in Agriculture and Forestry, Computer Vision for Automation, Agricultural Automation
Abstract: Aotearoa New Zealand has a strong and growing apple industry but struggles to access workers to complete skilled, seasonal tasks such as thinning. To ensure effective thinning and make informed decisions on a per-tree basis, it is crucial to accurately measure the crop load of individual apple trees. However, this task poses challenges due to the dense foliage that hides the fruitlets within the tree structure. In this paper, we introduce the vision system of an automated apple fruitlet thinning robot, developed to tackle the labor shortage, and present its initial design, implementation, and evaluation. The platform straddles the 3.4 m tall 2D apple canopy structures to create an accurate map of the fruitlets on each tree. We show that this platform can measure the fruitlet load on an apple tree by scanning both sides of the branch. The overarching platform is justified by the higher counting accuracy of two-sided scans (81.17%) compared with one-sided scans (73.7%). The system was also demonstrated to produce size estimates within 5.9% RMSE of the true size.
|
| |
| 15:30-17:00, Paper MoBIP-13.7 | Add to My Program |
| Cross-Domain Autonomous Driving Perception Using Contrastive Appearance Adaptation |
|
| Zheng, Ziqiang | Hong Kong University of Science and Technology |
| Chen, Yingshu | HKUST |
| Hua, Binh-Son | VinAI |
| Wu, Yang | Tencent |
| Yeung, Sai-Kit | Hong Kong University of Science and Technology |
Keywords: Computer Vision for Automation, Object Detection, Segmentation and Categorization, Autonomous Vehicle Navigation
Abstract: Addressing domain shifts for complex perception tasks in autonomous driving has long been a challenging problem. In this paper, we show that existing domain adaptation methods pay little attention to the "content mismatch" issue between source and target domains, which weakens domain adaptation performance and the decoupling of domain-invariant and domain-specific representations. To solve these problems, we propose an image-level domain adaptation framework that adapts source-domain images to the target domain using content-aligned source-target image pairs. Our framework consists of three mutually beneficial modules in a cycle: a cross-domain content alignment module that generates source-target pairs with consistent content representations in a self-supervised manner, a reference-guided image synthesis module based on the generated content-aligned source-target image pairs, and a contrastive learning module that self-supervises the domain-invariant feature extractor. Our contrastive appearance adaptation is task-agnostic and robust to complex perception tasks in autonomous driving. Our proposed method demonstrates state-of-the-art results in cross-domain object detection, semantic segmentation, and depth estimation, as well as better image synthesis ability both qualitatively and quantitatively.
|
| |
| 15:30-17:00, Paper MoBIP-13.8 | Add to My Program |
| MENTOR: Multilingual tExt detectioN TOward leaRning by Analogy |
|
| Lin, Hsin-Ju | National Yang Ming Chiao Tung University |
| Chung, Tsu-Chun | National Yang Ming Chiao Tung University |
| Hsiao, Ching-chun | National Yang Ming Chiao Tung University |
| Chen, Pin-Yu | IBM Research |
| Chiu, Wei-Chen | National Chiao Tung University |
| Huang, Ching-Chun | National Chiao Tung University |
Keywords: Computer Vision for Automation, Recognition, Semantic Scene Understanding
Abstract: Text detection is frequently used in vision-based mobile robots when they need to interpret text in their surroundings to perform a given task. For instance, delivery robots in multilingual cities need to be capable of multilingual text detection so that they can read traffic signs and road markings. Moreover, the target languages change from region to region, implying the need to efficiently re-train the models to recognize novel languages. However, collecting and labeling training data for novel languages is cumbersome, and the effort to re-train an existing text detector is considerable. Even worse, this routine repeats whenever a novel language appears. This motivates us to propose a new problem setting to tackle these challenges more efficiently: we ask for a generalizable multilingual text detection framework that detects and identifies seen and unseen language regions in scene images without requiring supervised training data for unseen languages or model re-training. To this end, we propose "MENTOR", the first work to realize a learning strategy between zero-shot and few-shot learning for multilingual scene text detection. During the training phase, we leverage "zero-cost" synthesized printed texts and the available training (seen) languages to learn a meta-mapping from printed texts to language-specific kernel weights. Meanwhile, dynamic convolution networks guided by the language-specific kernels are trained to realize a detection-by-feature-matching scheme. In the inference phase, "zero-cost" printed texts are synthesized for a new target language. By utilizing the learned meta-mapping and the matching network, MENTOR can freely identify the text regions of the new language. Experiments show our model achieves results comparable with supervised methods for seen languages and outperforms other methods in detecting unseen languages.
|
| |
| 15:30-17:00, Paper MoBIP-13.9 | Add to My Program |
| Towards a Robust Adversarial Patch Attack against Unmanned Aerial Vehicles Object Detection |
|
| Shrestha, Samridha | Technology Innovation Institute |
| Pathak, Saurabh | Technology Innovation Institute |
| Viegas, Eduardo | Pontifícia Universidade Católica do Paraná (PUCPR), Brazil |
Keywords: Computer Vision for Automation, Deep Learning Methods
Abstract: Object detection techniques for autonomous Unmanned Aerial Vehicles (UAV) are built upon Deep Neural Networks (DNN), which are known to be vulnerable to adversarial patch perturbation attacks that lead to object detection evasion. Yet, current adversarial patch generation schemes are not designed for UAV imagery settings. This paper proposes a new robust adversarial patch generation attack against object detection with UAVs. We build adversarial patches considering UAV-specific settings such as the UAV camera perspective, viewing angle, distance, and brightness changes. As a result, built patches can also degrade the accuracy of object detector models implemented with different initializations and architectures. Experiments conducted on the VisDrone dataset have shown the proposal's feasibility, achieving an attack success rate of up to 80% in a white-box setting. In addition, we also transfer the patch against DNN models with different initializations and different architectures, reaching attack success rates of up to 75% and 78%, respectively, in a gray-box setting.
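The UAV-specific conditions mentioned above (viewing distance, brightness changes) are typically handled by optimizing the patch over randomly sampled transformations. A minimal sketch of one such sampling step is shown below; the scale and brightness ranges are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_transform(patch):
    """Apply one random UAV-style transformation to a patch in [0, 1]:
    a scale change (viewing distance) and a brightness shift."""
    scale = rng.uniform(0.3, 1.0)            # farther UAV -> smaller patch
    h, w = patch.shape[:2]
    nh, nw = max(1, int(h * scale)), max(1, int(w * scale))
    ys = np.arange(nh) * h // nh             # nearest-neighbour resize,
    xs = np.arange(nw) * w // nw             # sufficient for a sketch
    resized = patch[ys][:, xs]
    brightness = rng.uniform(-0.2, 0.2)      # lighting variation
    return np.clip(resized + brightness, 0.0, 1.0)
```

During patch optimization, the adversarial loss would be averaged over many such samples (in the spirit of Expectation over Transformation) so the final patch stays effective across perspectives and lighting.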
|
| |
| 15:30-17:00, Paper MoBIP-13.10 | Add to My Program |
| Fast Point to Mesh Distance by Domain Voxelization |
|
| Gutow, Geordan | Carnegie Mellon University |
| Choset, Howie | Carnegie Mellon University |
Keywords: Computational Geometry, RGB-D Perception, Computer Vision for Automation
Abstract: Computing the distance from a point to a triangle mesh is a key computational step in robotics pipelines such as registration and collision detection, with applications to path planning, SLAM, and RGB-D vision. Numerous techniques to accelerate this computation have been developed, many of which use a cheap pre-processing step to construct a hierarchical decomposition of the mesh. If the mesh is fixed and known ahead of time, there is an opportunity to conduct more expensive pre-computations to accelerate the subsequent distance queries. This work presents a voxelization approach, implemented on both CPU and GPU, to compute point to mesh distance that constructs for each voxel a near-minimal set of triangles that is guaranteed to include every triangle that is closest to at least one point in the voxel. Theoretical and numerical comparisons with six alternative distance algorithms demonstrate the speed advantages of the proposed method.
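The core precomputation can be illustrated with a simple conservative bound. In this hedged sketch (not the paper's exact construction), triangle distance is approximated by distance to the triangle centroid, and a triangle is kept for a voxel unless no point of the voxel could ever be closest to it:

```python
import numpy as np

def build_candidate_sets(tri_centroids, voxel_centers, voxel_size):
    """For each voxel, keep every triangle that might be nearest to some
    point inside it. With r = half the voxel diagonal, any triangle at
    centroid distance d > d_min + 2r can never be the closest one for a
    point in the voxel, so it is dropped."""
    r = 0.5 * np.sqrt(3.0) * voxel_size
    candidates = []
    for c in voxel_centers:
        d = np.linalg.norm(tri_centroids - c, axis=1)
        candidates.append(np.nonzero(d <= d.min() + 2.0 * r)[0])
    return candidates

def point_to_mesh_dist(p, tri_centroids, voxel_index, candidates):
    """Query-time distance: scan only the voxel's near-minimal set."""
    idx = candidates[voxel_index]
    return np.linalg.norm(tri_centroids[idx] - p, axis=1).min()
```

The expensive pass over all triangles happens once per voxel offline; each online query then touches only the small per-voxel candidate set, which is the source of the speedup the abstract describes.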
|
| |
| 15:30-17:00, Paper MoBIP-13.11 | Add to My Program |
| AirLine: Efficient Learnable Line Detection with Local Edge Voting |
|
| Lin, Xiao | Georgia Institute of Technology |
| Wang, Chen | State University of New York at Buffalo |
Keywords: Computer Vision for Automation, SLAM
Abstract: Line detection is widely used in many robotic tasks such as scene recognition, 3D reconstruction, and simultaneous localization and mapping (SLAM). Compared to points, lines can provide both low-level and high-level geometrical information for downstream tasks. In this paper, we propose a novel learnable edge-based line detection algorithm, AirLine, which can be applied to various tasks. In contrast to existing learnable endpoint-based methods, which are sensitive to the geometrical condition of environments, AirLine can extract line segments directly from edges, resulting in a better generalization ability for unseen environments. To balance efficiency and accuracy, we introduce a region-grow algorithm and a local edge voting scheme for line parameterization. To the best of our knowledge, AirLine is one of the first learnable edge-based line detection methods. Our extensive experiments have shown that it retains state-of-the-art-level precision, yet with a 3-80x runtime acceleration compared to other learning-based methods, which is critical for low-power robots.
|
| |
| 15:30-17:00, Paper MoBIP-13.12 | Add to My Program |
| 3D Skeletonization of Complex Grapevines for Robotic Pruning |
|
| Schneider, Franz | Carnegie Mellon University |
| Jayanth, Sushanth | Carnegie Mellon University |
| Silwal, Abhisesh | Carnegie Mellon University |
| Kantor, George | Carnegie Mellon University |
Keywords: RGB-D Perception, Robotics and Automation in Agriculture and Forestry, Computer Vision for Automation
Abstract: Robotic pruning of dormant grapevines is an area of active research in order to promote vine balance and grape quality, but so far robotic efforts have largely focused on planar, simplified vines not representative of commercial vineyards. This paper aims to advance the robotic perception capabilities necessary for pruning in denser and more complex vine structures by extending plant skeletonization techniques. The proposed pipeline generates skeletal grapevine models that have lower reprojection error and higher connectivity than baseline algorithms. We also show how 3D and skeletal information enables prediction accuracy of pruning weight for dense vines surpassing prior work, where pruning weight is an important vine metric influencing pruning site selection.
|
| |
| 15:30-17:00, Paper MoBIP-13.13 | Add to My Program |
| AdaptSeqVPR: An Adaptive Sequence-Based Visual Place Recognition Pipeline |
|
| Li, Heshan | Nanyang Technological University |
| Peng, Guohao | Nanyang Technological University |
| Zhang, Jun | Nanyang Technological University |
| Vaikundam, Sriram | Continental Automotive Singapore Pte Ltd |
| Wang, Danwei | Nanyang Technological University |
Keywords: Computer Vision for Automation
Abstract: Visual Place Recognition (VPR) is essential for autonomous robots and unmanned vehicles, since accurate identification of previously visited sites can trigger loop closure to optimize the built map. The most prevalent methods tackle VPR as a single-frame retrieval task, using a CNN-based encoder to describe and compare each individual frame. These methods, however, overlook the temporal information between frames. Other methods improve on this by searching the database with consecutive frames, which can greatly reduce false positives. Nevertheless, current sequence-based methods typically assume the image frames to be captured at a constant speed, which is not always the case in practice. Therefore, we propose an adaptive sequence search strategy (AdaptSeq), which can dynamically alter the step size between adjacent frames in the retrieved sequence trajectory. In addition, to address invalid retrieval of input frames that have no true correspondence in the database, we propose a CNN-based discriminator named DDsNet, which determines whether the top retrieved candidates are true positives based on learned statistics rather than a hand-tuned threshold. Overall, we construct a novel sequence-based VPR pipeline named AdaptSeqVPR. It utilizes a CNN-based encoder for frame descriptions and encompasses AdaptSeq and DDsNet for sequence matching. The experimental results indicate that AdaptSeqVPR exhibits superior performance compared to the baselines SeqSLAM and SeqVLAD. Notably, our method robustly handles sequence-based VPR for vehicles traveling at non-uniform speeds in changing environments.
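The idea of a variable step size can be sketched with a tiny greedy search over a query-to-database similarity matrix. This is an illustrative simplification: the step choices and the greedy policy are assumptions, not the paper's exact algorithm.

```python
import numpy as np

def adaptseq_score(sim, start, steps=(1, 2, 3)):
    """Greedy sketch of an adaptive sequence search: walk the database
    from `start`, picking at every query frame the step size that leads
    to the most similar next database frame."""
    n_q, n_db = sim.shape
    j, total = start, sim[0, start]
    for i in range(1, n_q):
        nxt = [min(j + s, n_db - 1) for s in steps]
        j = max(nxt, key=lambda k: sim[i, k])    # adaptive step choice
        total += sim[i, j]
    return total
```

With a database recorded at twice the query speed, a fixed step of 1 would collect only the first match, whereas the adaptive walk recovers the full trajectory despite the speed mismatch.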
|
| |
| 15:30-17:00, Paper MoBIP-13.14 | Add to My Program |
| Towards Automated Void Detection for Search and Rescue with 3D Perception |
|
| Bal, Ananya | Carnegie Mellon University |
| Gupta, Ashutosh | BITS Pilani KK Birla Goa Campus |
| Goyal, Pranav | Birla Institute of Technology & Science - Pilan |
| Merrick, David | Florida State University |
| Murphy, Robin | Texas A&M |
| Choset, Howie | Carnegie Mellon University |
Keywords: Search and Rescue Robots, Aerial Systems: Perception and Autonomy, Computer Vision for Automation
Abstract: In a structural collapse, debris piles up in a chaotic and unstable manner, creating pockets and void spaces that are difficult to see or access. These regions often have the highest chance of concealing survivors, and identifying them can increase the success of a search and rescue (SAR) operation while ensuring the safety of both survivors and rescue teams. In this paper, we present an approach for ex post facto void detection in rubble piles using registered 3D point clouds reconstructed from aerial images captured at multiple times on the scene. We perform a temporal layering of these point clouds to capture the dynamic surface of the rubble pile over multiple days of the SAR operation and analyze this 3D structure to detect candidate regions corresponding to void spaces. The layering is achieved by a parallel 3D point cloud reconstruction of the scene using the COLMAP Structure-from-Motion pipeline. The void detection is achieved by applying multiple point filtering criteria in thin segments of the 3D point clouds of the rubble. We test our approach on aerial images collected from the Surfside structural collapse near Miami in June 2021. Our method improves registration compared to standard point cloud registration methods on individual 3D reconstructions, reducing translation errors by 82%. Additionally, our method detects 9 out of 10 void spaces that were observed by experts in the rubble.
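As a toy illustration of the thin-segment filtering idea, one simple criterion is a drop in point density within a thin slice of the cloud. The slicing axis, thickness, and threshold below are illustrative assumptions, not the paper's actual criteria:

```python
import numpy as np

def candidate_voids(points, z_min, z_max, thickness=0.1, min_pts=20):
    """Slice a rubble point cloud into thin horizontal segments and flag
    slices whose point count drops below a threshold as candidate void
    regions (simplified sketch of one possible filtering criterion)."""
    z = points[:, 2]
    edges = np.arange(z_min, z_max + thickness, thickness)
    counts, _ = np.histogram(z, bins=edges)
    return [(edges[i], edges[i + 1]) for i, c in enumerate(counts) if c < min_pts]
```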
|
| |
| MoBIP-14 Regular session, Hall E |
Add to My Program |
| Clone of 'Localization II' |
|
| |
| |
| 15:30-17:00, Paper MoBIP-14.1 | Add to My Program |
| (LC)2: LiDAR-Camera Loop Constraints for Cross-Modal Place Recognition |
|
| Lee, Alex | Hyundai Motor Company |
| Song, Seungwon | Hyundai Motor Company |
| Lim, Hyungtae | Korea Advanced Institute of Science and Technology |
| Lee, Wooju | KAIST |
| Myung, Hyun | KAIST (Korea Advanced Institute of Science and Technology) |
Keywords: Localization, Sensor Fusion, Deep Learning for Visual Perception
Abstract: Localization has been a challenging task for autonomous navigation. A loop detection algorithm must overcome environmental changes for the place recognition and re-localization of robots. Therefore, deep learning has been extensively studied for the consistent transformation of measurements into localization descriptors. Street view images are easily accessible; however, images are vulnerable to appearance changes. LiDAR can robustly provide precise structural information, but constructing a point cloud database is expensive, and point clouds exist only in limited places. Different from previous works that train networks to produce a shared embedding directly between the 2D image and 3D point cloud, we transform both kinds of data into 2.5D depth images for matching. In this work, we propose a novel cross-matching method, called (LC)2, for achieving LiDAR localization without a prior point cloud map. To this end, LiDAR measurements are expressed in the form of range images before matching to reduce the modality discrepancy. Subsequently, the network is trained to extract localization descriptors from disparity and range images, and the best matches are employed as a loop factor in a pose graph. Using public datasets that include multiple sessions under significantly different lighting conditions, we demonstrate that LiDAR-based navigation systems can be optimized from image databases and vice versa.
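Converting a LiDAR scan into a range image, the 2.5D representation the method matches against image-derived disparity, can be sketched as a spherical projection. The field of view and resolution below are illustrative assumptions, not the sensor model used in the paper:

```python
import numpy as np

def pointcloud_to_range_image(points, h=32, w=360, max_range=80.0):
    """Project Nx3 LiDAR points into an h x w spherical range image.
    Assumed sensor: +/-15 deg vertical FoV, full 360 deg horizontal."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                                  # [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-6), -1.0, 1.0))
    u = ((yaw + np.pi) / (2 * np.pi) * w).astype(int) % w   # column
    fov_up, fov_down = np.radians(15.0), np.radians(-15.0)
    v = (fov_up - pitch) / (fov_up - fov_down) * h          # row
    v = np.clip(v, 0, h - 1).astype(int)
    img = np.full((h, w), max_range, dtype=np.float32)
    np.minimum.at(img, (v, u), np.clip(r, 0, max_range))    # keep nearest return
    return img
```

Once both modalities live in this depth-image space, a shared descriptor network can be trained on them, and the best cross-modal matches added as loop factors in the pose graph.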
|
| |
| 15:30-17:00, Paper MoBIP-14.2 | Add to My Program |
| Visual Localization Based on Multiple Maps |
|
| Lin, Yukai | ETH Zurich |
| Liu, Liu | Huawei |
| Liang, Xiao | The University of Tokyo |
| Li, Jiangwei | Huawei Cloud Computing Technologies Co., Ltd |
Keywords: Localization, Vision-Based Navigation, SLAM
Abstract: This paper proposes a multi-map based visual localization method for image sequences. Given multiple single-map based localization results, we combine them with SLAM to estimate robust and accurate camera poses under challenging conditions. Our method comprises three modules connected in a sequence. First, we reconstruct multiple reference maps using the Structure from Motion technique, one map for each reference sequence. A single-image-based localization pipeline is performed to estimate 6-DoF camera poses for each query image, one for each map. Second, a consensus set maximization module is proposed to select the best camera poses from multi-map poses, estimating one 6-DoF camera pose for each query image. Finally, a robust pose refinement module is proposed to optimize 6-DoF camera poses of query images, combining map-based localization and local SLAM information. Experiments show that the proposed pipeline achieves state-of-the-art performance on challenging map-based localization benchmarks. Demonstrating the broad applicability of our method, we obtained the first place in the challenge of Map-Based Localization for Autonomous Driving at ECCV2022.
|
| |
| 15:30-17:00, Paper MoBIP-14.3 | Add to My Program |
| An Interacting Multiple Model Approach Based on Maximum Correntropy Student's T Filter |
|
| Candan, Fethi | The University of Sheffield |
| Beke, Aykut | Aselsan |
| Mihaylova, Lyudmila | University of Sheffield |
Keywords: Localization, Visual Tracking, Aerial Systems: Applications
Abstract: This paper presents a novel Interacting Multiple Model (IMM)-based maximum correntropy Student's T filter (MCStF). The MCStF can handle non-Gaussian measurement noise and is shown to outperform the IMM algorithm based on Kalman Filters (KFs) both in a simulation environment and on a real-time system. The Crazyflie 2.0 nano Unmanned Air Vehicle (UAV) model is used in the simulation validation, and results from 3000 independent Monte Carlo runs are shown. After obtaining simulation results under monotonically changing non-Gaussian noise distributions, the performance of the filters is compared. The same scenario is then applied on a real-time system using the Crazyflie 2.0, and results from real-time tests are presented in which the position of the Crazyflie 2.0 is estimated online.
|
| |
| 15:30-17:00, Paper MoBIP-14.4 | Add to My Program |
| Deep Robust Multi-Robot Re-Localisation in Natural Environments |
|
| Ramezani, Milad | CSIRO |
| Griffiths, Ethan | Queensland University of Technology |
| Haghighat, Maryam | Queensland University of Technology |
| Pitt, Alex | CSIRO |
| Moghadam, Peyman | CSIRO |
Keywords: Localization, Deep Learning Methods, Recognition
Abstract: The success of re-localisation has crucial implications for the practical deployment of robots operating within a prior map or relative to one another in real-world scenarios. Using a single modality, place recognition and localisation can be compromised in challenging environments such as forests. To address this, we propose a strategy to prevent lidar-based re-localisation failure using lidar-image cross-modality. Our solution relies on self-supervised 2D-3D feature matching to predict alignment and misalignment. Leveraging a deep network for lidar feature extraction and relative pose estimation between point clouds, we train a model to evaluate the estimated transformation. A model predicting the presence of misalignment is learned by analysing image-lidar similarity in the embedding space and the geometric constraints available within the region seen in both modalities in Euclidean space. Experimental results using real datasets (offline and online modes) demonstrate the effectiveness of the proposed pipeline for robust re-localisation in unstructured, natural environments.
|
| |
| 15:30-17:00, Paper MoBIP-14.5 | Add to My Program |
| FVLoc-NeRF: Fast Vision-Only Localization within Neural Radiation Field |
|
| Guo, Wenzhi | Nanjing University |
| Haiyang, Bai | Nanjing University |
| Mou, Yuanqu | Nanjing University |
| Liu, Jia | Nanjing University |
| Chen, Lijun | Nanjing University |
Keywords: Localization, Deep Learning Methods, SLAM
Abstract: In recent years, Neural Radiance Fields (NeRF) have shown tremendous potential for encoding highly detailed 3D geometry and environmental appearance, making them a promising alternative to traditional explicit maps for robot localization. However, current NeRF localization methods suffer from significant computational overheads, primarily resulting from the large number of iterations or particle samples required, as well as the additional computational demands associated with estimating the initial pose through multimodal sensors. To overcome these challenges, we propose a novel and time-efficient NeRF localization pipeline, named FVLoc-NeRF. This pipeline employs only RGB monocular images as input and leverages a retrieval method to obtain the initial pose. Subsequently, the pose update is derived using the Perspective-n-Point (PnP) algorithm, thereby considerably reducing the number of iterations and accelerating the localization process. Our extensive experimental results clearly demonstrate that FVLoc-NeRF is much faster than the state-of-the-art method.
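The retrieval step that supplies the initial pose can be sketched as a nearest-neighbour lookup in descriptor space. This is a hedged sketch: the descriptor network and database layout are assumptions, and the PnP refinement that follows (e.g. via OpenCV's solvePnP on 2D-3D correspondences) is omitted.

```python
import numpy as np

def retrieve_initial_pose(query_desc, db_descs, db_poses):
    """Return the pose of the database image whose global descriptor
    has the highest cosine similarity with the query descriptor."""
    sims = db_descs @ query_desc / (
        np.linalg.norm(db_descs, axis=1) * np.linalg.norm(query_desc) + 1e-9)
    return db_poses[int(np.argmax(sims))]
```

Starting PnP from a retrieved pose rather than from scratch is what lets the pipeline avoid the many iterations or particle samples of earlier NeRF localization methods.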
|
| |
| 15:30-17:00, Paper MoBIP-14.6 | Add to My Program |
| RADA: Robust Adversarial Data Augmentation for Camera Localization in Challenging Conditions |
|
| Wang, Jialu | Oxford |
| Saputra, Muhamad Risqi U. | Monash University, Indonesia |
| Lu, Chris Xiaoxuan | University of Edinburgh |
| Trigoni, Niki | University of Oxford |
| Markham, Andrew | Oxford University |
Keywords: Localization, Computer Vision for Transportation
Abstract: Camera localization is a fundamental problem for many applications in computer vision, robotics, and autonomy. Despite recent deep learning-based approaches, a lack of robustness in challenging conditions persists due to changes in appearance caused by texture-less planes, repeating structures, reflective surfaces, motion blur, and illumination changes. Data augmentation is an attractive solution, but standard image perturbation methods fail to improve localization robustness. To address this, we propose RADA, which concentrates perturbations on the most vulnerable pixels, generating relatively small image perturbations that still perplex the network. Our method outperforms previous augmentation techniques, achieving up to twice the accuracy of state-of-the-art models even under 'unseen' challenging weather conditions.
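The "most vulnerable pixels" idea can be sketched by perturbing only the pixels with the largest loss-gradient magnitude. This is a simplified, hedged illustration (gradients are assumed precomputed; the fraction and step size are arbitrary), not the paper's exact procedure:

```python
import numpy as np

def vulnerable_pixel_perturb(image, grad, k_frac=0.05, eps=0.1):
    """Add a signed perturbation of size eps only at the k_frac fraction
    of pixels with the largest loss-gradient magnitude; all other pixels
    are left untouched."""
    flat_grad = grad.reshape(-1)
    k = max(1, int(k_frac * flat_grad.size))
    idx = np.argpartition(np.abs(flat_grad), -k)[-k:]   # top-k pixels
    noise = np.zeros_like(flat_grad)
    noise[idx] = eps * np.sign(flat_grad[idx])
    return np.clip(image + noise.reshape(image.shape), 0.0, 1.0)
```

Perturbing only the pixels the network relies on most yields small but maximally confusing augmentations, matching the abstract's contrast with uniform image perturbations.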
|
| |
| 15:30-17:00, Paper MoBIP-14.7 | Add to My Program |
| MagHT: A Magnetic Hough Transform for Fast Indoor Place Recognition |
|
| Abdul Raouf, Iad | CEA List |
| Gay-Bellile, Vincent | CEA LIST |
| Bourgeois, Steve | CEA LIST |
| Joly, Cyril | Mines ParisTech, PSL Research University |
| Paljic, Alexis | Mines ParisTech |
Keywords: Localization, Recognition, SLAM
Abstract: This article proposes a novel indoor magnetic field-based place recognition algorithm that is accurate and fast to compute. For that, we modified the generalized "Hough Transform" to process magnetic data (MagHT). It takes as input a sequence of magnetic measures whose relative positions are recovered by an odometry system and recognizes the places in the magnetic map where they were acquired. It also returns the global transformation from the coordinate frame of the input magnetic data to the magnetic map reference frame. Experimental results on several real datasets in large indoor environments demonstrate that the obtained localization error, recall, and precision are similar to or are better than state-of-the-art methods while improving the runtime by several orders of magnitude. Moreover, unlike magnetic sequence matching-based solutions such as DTW, our approach is independent of the path taken during the magnetic map creation.
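The Hough-style voting can be illustrated in a stripped-down, translation-only form: every (measurement, map-sample) pair with a similar magnetic value votes for the translation that would align them, and the most-voted cell wins. Rotation handling and the paper's actual parameterization are omitted; everything below is an illustrative sketch:

```python
from collections import Counter

def hough_translation(query_xy, query_vals, map_xy, map_vals, cell=0.5, tol=1.0):
    """Vote in a discretized 2D translation space: each magnetically
    compatible (query, map) pair votes for the translation aligning it."""
    votes = Counter()
    for (qx, qy), qv in zip(query_xy, query_vals):
        for (mx, my), mv in zip(map_xy, map_vals):
            if abs(qv - mv) <= tol:              # compatible field values
                key = (round((mx - qx) / cell), round((my - qy) / cell))
                votes[key] += 1
    (ti, tj), _ = votes.most_common(1)[0]
    return ti * cell, tj * cell
```

Unlike sequence matching (e.g. DTW), this voting is independent of the path taken during map creation, which is the property the abstract emphasizes.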
|
| |
| 15:30-17:00, Paper MoBIP-14.8 | Add to My Program |
| What to Learn: Features, Image Transformations, or Both? |
|
| Chen, Yuxuan | University of Toronto |
| Xu, Binbin | University of Toronto |
| Dümbgen, Frederike | University of Toronto |
| Barfoot, Timothy | University of Toronto |
Keywords: Localization, Deep Learning for Visual Perception, Vision-Based Navigation
Abstract: Long-term visual localization is an essential problem in robotics and computer vision, but remains challenging due to the environmental appearance changes caused by lighting and seasons. While many existing works have attempted to solve it by directly learning invariant sparse keypoints and descriptors to match scenes, these approaches still struggle with adverse appearance changes. Recent developments in image transformations such as neural style transfer have emerged as an alternative to address such appearance gaps. In this work, we propose to combine an image transformation network and a feature-learning network to improve long-term localization performance. Given night-to-day image pairs, the image transformation network transforms the night images into day-like conditions prior to feature matching; the feature network learns to detect keypoint locations with their associated descriptor values, which can be passed to a classical pose estimator to compute the relative poses. We conducted various experiments to examine the effectiveness of combining style transfer and feature learning and its training strategy, showing that such a combination greatly improves long-term localization performance.
|
| |
| 15:30-17:00, Paper MoBIP-14.9 | Add to My Program |
| Global Localization: Utilizing Relative Spatio-Temporal Geometric Constraints from Adjacent and Distant Cameras |
|
| Altillawi, Mohammad | Huawei, Autonomous University of Barcelona, |
| Pataki, Zador | ETH Zurich |
| Li, Shile | Algolux Germany |
| Liu, Ziyuan | Huawei Group |
Keywords: Localization, Vision-Based Navigation, Virtual Reality and Interfaces
Abstract: Re-localizing a camera from a single image in a previously mapped area is vital for many computer vision applications in robotics and augmented/virtual reality. In this work, we address the problem of estimating the 6 DoF camera pose relative to a global frame from a single image. We propose to leverage a novel network of relative spatial and temporal geometric constraints to guide the training of a Deep Network for Localization. We employ simultaneously spatial and temporal relative pose constraints that are obtained not only from adjacent camera frames but also from camera frames that are distant in the spatio-temporal space of the scene. We show that our method, through these constraints, is capable of learning to localize when little or very sparse ground-truth 3D coordinates are available. In our experiments, this is less than 1% of available ground-truth data. We evaluate our method on 3 common visual localization datasets and show that it outperforms other direct pose estimation methods.
|
| |
| 15:30-17:00, Paper MoBIP-14.10 | Add to My Program |
| Uncertainty-Aware Lidar Place Recognition in Novel Environments |
|
| Mason, Keita | CSIRO |
| Knights, Joshua Barton | Queensland University of Technology |
| Ramezani, Milad | CSIRO |
| Moghadam, Peyman | CSIRO |
| Miller, Dimity | Queensland University of Technology |
Keywords: Localization, Deep Learning for Visual Perception, Recognition
Abstract: State-of-the-art lidar place recognition models exhibit unreliable performance when tested on environments different from their training dataset, which limits their use in complex and evolving environments. To address this issue, we investigate the task of uncertainty-aware lidar place recognition, where each predicted place must have an associated uncertainty that can be used to identify and reject incorrect predictions. We introduce a novel evaluation protocol and present the first comprehensive benchmark for this task, testing across five uncertainty estimation techniques and three large-scale datasets. Our results show that an Ensembles approach is the highest performing technique, consistently improving the performance of lidar place recognition and uncertainty estimation in novel environments, though it incurs a computational cost. Code is publicly available at https://github.com/csiro-robotics/Uncertainty-LPR.
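The ensemble idea the benchmark found strongest can be sketched as majority voting across independently trained descriptor models, with the level of agreement serving as the confidence signal. The retrieval and aggregation details here are illustrative assumptions, not the benchmarked implementation:

```python
import numpy as np

def ensemble_retrieve(query_descs, db_descs_per_model):
    """Each ensemble member retrieves its own best-matching place; the
    vote share of the winning place acts as a confidence score, so low
    agreement flags a prediction that should be rejected."""
    votes = []
    for q, db in zip(query_descs, db_descs_per_model):
        votes.append(int(np.argmax(db @ q)))      # per-member nearest place
    best = max(set(votes), key=votes.count)
    return best, votes.count(best) / len(votes)
```

The computational cost noted in the abstract comes from running every member's encoder per query; the payoff is an uncertainty estimate that transfers to environments unseen in training.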
|
| |
| 15:30-17:00, Paper MoBIP-14.11 | Add to My Program |
| Hot-NetVLAD: Learning Discriminatory Key Points for Visual Place Recognition |
|
| Li, Zhikai | National University of Singapore |
| Lee, Christina Dao Wen | National University of Singapore |
| Tung, Beatrix | Singapore-MIT Alliance for Research and Technology |
| Huang, Zefan | National University of Singapore |
| Rus, Daniela | MIT |
| Ang Jr, Marcelo H | National University of Singapore |
Keywords: Localization, Vision-Based Navigation, Intelligent Transportation Systems
Abstract: Hot-NetVLAD implements a hot-spot detector on a learned local key-patch descriptor algorithm for Visual Place Recognition (VPR), thereby greatly cutting down the size of features extracted. The hot-spots pinpoint which regions are crucial for comparison when performing VPR. As hot-spots land on only a small portion of the feature space, the number of local descriptors extracted is greatly reduced. A novel method to extract ground truths of hot-spots in the context of VPR is proposed so that the hot-spot detector in Hot-NetVLAD can be trained for VPR purposes. Hot-NetVLAD is evaluated on the Pittsburgh250k and Tokyo24/7 datasets. While results show that Hot-NetVLAD trades some accuracy loss for storage efficiency, the recall remains competitive when compared to state-of-the-art methods. Furthermore, identified hot-spots bring new insights to key regions required for VPR, as they tend to fall on distinguishable static objects in the scene. This can potentially be applied to increase the robustness of mobile robot localization by increasing resilience to dynamic environments, whilst still being able to perform static obstacle matching effectively.
|
| |
| 15:30-17:00, Paper MoBIP-14.12 | Add to My Program |
| Data-Driven Based Cascading Orientation and Translation Estimation for Inertial Navigation |
|
| Deng, Xiangyu | OPPO |
| Wang, Shenyue | OPPO |
| Shan, ChunXiang | OPPO |
| Lu, Jinjie | OPPO |
| Jin, Ke | OPPO |
| Li, Jijunnan | OPPO Research Institute |
| Guo, Yandong | OPPO Research Institute |
Keywords: Localization, AI-Based Methods
Abstract: Recently, data-driven approaches have brought both opportunities and challenges for Inertial Navigation Systems. In this paper, we propose a novel data-driven method composed of cascading orientation and translation estimation with IMU-only measurements. For robust orientation estimation, we combine a CNN-based neural network with an EKF to eliminate orientation errors caused by sensor noise. We additionally propose a hybrid CNN-Transformer-based neural network that exploits both spatial and long-term temporal information to regress accurate translations. We conduct detailed evaluations on datasets acquired by iPhone and Android devices. The results demonstrate that our method outperforms state-of-the-art methods in both orientation and translation errors.
|
| |
| 15:30-17:00, Paper MoBIP-14.13 | Add to My Program |
| FE-Fusion-VPR: Attention-Based Multi-Scale Network Architecture for Visual Place Recognition by Fusing Frames and Events |
|
| Hou, Kuanxu | Northeastern University |
| Kong, Delei | Northeastern University (China) |
| Jiang, Junjie | Northeastern University |
| Zhuang, Hao | Northeastern University |
| Huang, Xinjie | Northeastern University, China |
| Fang, Zheng | Northeastern University |
Keywords: Localization, Recognition, Deep Learning Methods
Abstract: Traditional visual place recognition (VPR), usually using standard cameras, easily fails due to glare or high-speed motion. By contrast, event cameras have the advantages of low latency, high temporal resolution, and high dynamic range, which can address the above issues. Nevertheless, event cameras are prone to failure in motionless scenes, while standard cameras can still provide appearance information in this case. Thus, exploiting the complementarity of standard cameras and event cameras can effectively improve the performance of VPR algorithms. In this paper, we propose FE-Fusion-VPR, an attention-based multi-scale network architecture for VPR that fuses frames and events. First, the intensity frame and event volume are fed into the two-stream feature extraction network for shallow feature fusion. Next, the three-scale features are obtained through the multi-scale fusion network and aggregated into three sub-descriptors using the VLAD layer. Finally, the weight of each sub-descriptor is learned through the descriptor re-weighting network to obtain the final refined descriptor. Experimental results show that our FE-Fusion-VPR outperforms existing frame-based, event-based, and fusion-based VPR methods in most cases on the Brisbane-Event-VPR and DDD20 datasets. In short, compared to previous works, FE-Fusion-VPR achieves new state-of-the-art (SOTA) VPR performance on these datasets by fusing frames and events.
|
| |
| MoBIP-15 Regular session, Hall E |
Add to My Program |
| Clone of 'Visual SLAM' |
|
| |
| |
| 15:30-17:00, Paper MoBIP-15.1 | Add to My Program |
| Self-Supervised Domain Calibration and Uncertainty Estimation for Place Recognition |
|
| Lajoie, Pierre-Yves | École Polytechnique De Montréal |
| Beltrame, Giovanni | Ecole Polytechnique De Montreal |
Keywords: SLAM, Deep Learning for Visual Perception
Abstract: Visual place recognition techniques based on deep learning, which have imposed themselves as the state-of-the-art in recent years, do not generalize well to environments visually different from the training set. Thus, to achieve top performance, it is sometimes necessary to fine-tune the networks to the target environment. To this end, we propose a self-supervised domain calibration procedure based on robust pose graph optimization from Simultaneous Localization and Mapping (SLAM) as the supervision signal without requiring GPS or manual labeling. Moreover, we leverage the procedure to improve uncertainty estimation for place recognition matches which is important in safety critical applications. We show that our approach can improve the performance of a state-of-the-art technique on a target environment dissimilar from its training set and that we can obtain uncertainty estimates. We believe that this approach will help practitioners to deploy robust place recognition solutions in real-world applications. Our code is available publicly: https://github.com/MISTLab/vpr-calibration-and-uncertainty
|
| |
| 15:30-17:00, Paper MoBIP-15.2 | Add to My Program |
| ISimLoc: Visual Global Localization for Previously Unseen Environments with Simulated Images (I) |
|
| Yin, Peng | City University of Hong Kong |
| Cisneros, Ivan | Carnegie Mellon University |
| Zhao, Shiqi | University of California San Diego |
| Zhang, Ji | Carnegie Mellon University |
| Choset, Howie | CMU |
| Scherer, Sebastian | Carnegie Mellon University |
Keywords: SLAM, Localization, Visual-Based Navigation, Visual Global Localization
Abstract: The camera is an attractive device for use in beyond-visual-line-of-sight drone operation since cameras are low in size, weight, power, and cost. However, state-of-the-art visual localization algorithms have trouble matching visual data that have significantly different appearances due to changes in illumination or viewpoint. This paper presents iSimLoc, a learning-based global re-localization approach that is robust to appearance and viewpoint differences. The features learned by iSimLoc's place recognition network can be utilized to match query images to reference images of a different stylistic domain and viewpoint. Additionally, our hierarchical global re-localization module searches in a coarse-to-fine manner, allowing iSimLoc to perform fast and accurate pose estimation. We evaluate our method on a dataset with appearance variations and a dataset focused on demonstrating large-scale matching during a long flight over complex terrain. iSimLoc achieves 88.7% and 83.8% successful retrieval rates on the two datasets, with 1.5 s inference time, compared to 45.8% and 39.7% using the next best method. These results demonstrate robust localization in a range of environments.
|
| |
| 15:30-17:00, Paper MoBIP-15.3 | Add to My Program |
| Converting Depth Images and Point Clouds for Feature-Based Pose Estimation |
|
| Lösch, Robert | TU Bergakademie Freiberg |
| Sastuba, Mark | Federal Railway Authority Germany |
| Toth, Jonas | TU Bergakademie Freiberg |
| Jung, Bernhard | TU Bergakademie Freiberg |
Keywords: Recognition, RGB-D Perception
Abstract: In recent years, depth sensors have become more and more affordable and have found their way into a growing number of robotic systems. However, mono- or multi-modal sensor registration, often a necessary step for further processing, faces many challenges on raw depth images or point clouds. This paper presents a method of converting depth data into images capable of visualizing spatial details that are essentially hidden in traditional depth images. After noise removal, a neighborhood of points forms two normal vectors whose difference is encoded into this new conversion. Compared to Bearing Angle images, our method yields brighter, higher-contrast images with more visible contours and more details. We tested feature-based pose estimation of both conversions in a visual odometry task and RGB-D SLAM. For all tested features, AKAZE, ORB, SIFT, and SURF, our new Flexion images yield better results than Bearing Angle images and show great potential to bridge the gap between depth data and classical computer vision. Source code is available here: https://rlsch.github.io/depth-flexion-conversion.
|
| |
| 15:30-17:00, Paper MoBIP-15.4 | Add to My Program |
| AirVO: An Illumination-Robust Point-Line Visual Odometry |
|
| Xu, Kuan | NTU |
| Hao, Yuefan | Geekplus Corp |
| Yuan, Shenghai | Nanyang Technological University |
| Wang, Chen | State University of New York at Buffalo |
| Xie, Lihua | Nanyang Technological University |
Keywords: SLAM, Localization
Abstract: This paper proposes an illumination-robust visual odometry (VO) system that incorporates both accelerated learning-based corner point algorithms and an extended line feature algorithm. To be robust to dynamic illumination, the proposed system employs convolutional neural networks (CNN) to detect and match reliable and informative corner points. Then point feature matching results and the distribution of point and line features are utilized to match and triangulate lines. By accelerating CNN parts and optimizing the pipeline, the proposed system is able to run in real-time on low-power embedded platforms. The proposed VO was evaluated on several datasets with varying illumination conditions, and the results show that it outperforms other state-of-the-art VO and VIO systems in terms of accuracy and robustness. The open-source nature of the proposed system allows for easy implementation and customization by the research community, enabling further development and improvement of VO for various applications.
|
| |
| 15:30-17:00, Paper MoBIP-15.5 | Add to My Program |
| NeRF-SLAM: Real-Time Dense Monocular SLAM with Neural Radiance Fields |
|
| Rosinol, Antoni | MIT |
| Carlone, Luca | Massachusetts Institute of Technology |
| Leonard, John | MIT |
Keywords: Mapping, Localization, SLAM
Abstract: We propose a novel geometric and photometric 3D mapping pipeline for accurate and real-time scene reconstruction from casually taken monocular images. To achieve this, we leverage recent advances in dense monocular SLAM and real-time hierarchical volumetric neural radiance fields. Our insight is that dense monocular SLAM provides the right information to fit a neural radiance field of the scene in real-time, by providing accurate pose estimates and depth-maps with associated uncertainty. Our proposed pipeline achieves better geometric and photometric accuracy than competing approaches (up to 178% better PSNR and 75% better L1 depth), while working in real-time and using only monocular images.
|
| |
| 15:30-17:00, Paper MoBIP-15.6 | Add to My Program |
| Scale Jump-Aware Pose Graph Relaxation for Monocular SLAM with Re-Initializations |
|
| Yuan, Runze | Shanghaitech |
| Cheng, Ran | Midea Robozone |
| Lige, Liu | Midea Group |
| Sun, Tao | Massachusetts Institute of Technology |
| Kneip, Laurent | ShanghaiTech University |
Keywords: SLAM, Localization
Abstract: Pose graph relaxation has become an indispensable addition to SLAM, enabling efficient global registration of sensor reference frames under the objective of satisfying pair-wise relative transformation constraints. The latter may be given by incremental motion estimation or global place recognition. While place recognition enables loop closures and drift compensation, care has to be taken in the monocular case, in which local estimates of structure and displacements can differ from reality not just in terms of noise, but also in terms of a scale factor. Owing to the accumulation of scale propagation errors, this scale factor drifts over time, hence scale-drift aware pose graph relaxation has been introduced. We extend this idea to cases in which the relative scale between subsequent sensor frames is unknown, a situation that can easily occur if monocular SLAM enters re-initialization and no reliable overlap between successive local maps can be identified. The approach is realized by a hybrid pose graph formulation that combines the regular similarity consistency terms with novel, scale-blind constraints. We apply the technique to the practically relevant case of small indoor service robots capable of effectuating purely rotational displacements, a condition that can easily cause tracking failures. We demonstrate that globally consistent trajectories can be recovered even if multiple re-initializations occur along the loop, and present an in-depth study of success and failure cases.
|
| |
| 15:30-17:00, Paper MoBIP-15.7 | Add to My Program |
| Optimizing the Extended Fourier Mellin Transformation Algorithm |
|
| Jiang, Wenqing | ShanghaiTech University |
| Li, Chengqian | ShanghaiTech University |
| Cao, Jinyue | Shanghaitech University |
| Schwertfeger, Sören | ShanghaiTech University |
Keywords: SLAM, Computer Vision for Automation
Abstract: With the increasing application of robots, stable and efficient Visual Odometry (VO) algorithms are becoming more and more important. Based on the Fourier Mellin Transformation (FMT) algorithm, the extended Fourier Mellin Transformation (eFMT) is an image registration approach that can be applied to downward-looking cameras, for example on aerial and underwater vehicles. eFMT extends FMT to multi-depth scenes and thus more application scenarios. It is a visual odometry method which estimates the pose transformation between three overlapping images. On this basis, we develop an optimized eFMT algorithm that improves certain aspects of the method and combines it with back-end optimization for the small loop of three consecutive frames. For this we investigate the extraction of uncertainty information from the eFMT registration, the related objective function and the graph-based optimization. Finally, we design a series of experiments to investigate the properties of this approach and compare it with other VO and SLAM (Simultaneous Localization and Mapping) algorithms. The results show the superior accuracy and speed of our o-eFMT approach, which is published as open source.
|
| |
| 15:30-17:00, Paper MoBIP-15.8 | Add to My Program |
| Marker-Based Visual SLAM Leveraging Hierarchical Representations |
|
| Tourani, Ali | University of Luxembourg |
| Bavle, Hriday | University of Luxembourg |
| Sanchez-Lopez, Jose Luis | Interdisciplinary Center for Security, Reliability and Trust (SnT) |
| Munoz Salinas, Rafael | University of Cordoba, Spain |
| Voos, Holger | University of Luxembourg |
Keywords: SLAM, Visual-Inertial SLAM, Mapping
Abstract: Fiducial markers can encode rich information about the environment and aid Visual SLAM (VSLAM) approaches in reconstructing maps with practical semantic information. Current marker-based VSLAM approaches mainly utilize markers for improving feature detections in low-feature environments and/or incorporating loop closure constraints, generating only low-level geometric maps of the environment prone to inaccuracies in complex environments. To bridge this gap, this paper presents a VSLAM approach utilizing a monocular camera along with fiducial markers to generate hierarchical representations of the environment while improving the camera pose estimate. The proposed approach detects semantic entities from the surroundings, including walls, corridors, and rooms encoded within markers, and appropriately adds topological constraints among them. Experimental results on a real-world dataset collected with a robot demonstrate that the proposed approach outperforms a marker-based VSLAM baseline in terms of accuracy, given the addition of new constraints while creating enhanced map representations. Furthermore, it shows satisfactory results when comparing the reconstructed map quality to the one rebuilt using a LiDAR SLAM approach.
|
| |
| 15:30-17:00, Paper MoBIP-15.9 | Add to My Program |
| RVWO: A Robust Visual-Wheel SLAM System for Mobile Robots in Dynamic Environments |
|
| Mahmoud, Jaafar | ITMO University |
| Penkovskiy, Andrey | ITMO University |
| Ha, The Long Vuong | ITMO University |
| Burkov, Aleksei | Sber Robotics Laboratory |
| Kolyubin, Sergey | ITMO University |
Keywords: SLAM, Sensor Fusion, Wheeled Robots
Abstract: This paper presents RVWO, a system designed to provide robust localization and mapping for wheeled mobile robots in challenging scenarios. The proposed approach leverages a probabilistic framework that incorporates semantic prior information about landmarks and visual re-projection error to create a landmark reliability model, which acts as an adaptive kernel for the visual residuals in optimization. Additionally, we fuse visual residuals with wheel odometry measurements, taking advantage of the planar motion assumption. The RVWO system is designed to be robust against wrong data association due to moving objects, poor visual texture, bad illumination, and wheel slippage. Evaluation results demonstrate that the proposed system shows competitive results in dynamic environments and outperforms existing approaches on both public benchmarks and our custom hardware setup. We also provide the code as an open-source contribution to the robotics community.
|
| |
| 15:30-17:00, Paper MoBIP-15.10 | Add to My Program |
| Event Camera-Based Visual Odometry for Dynamic Motion Tracking of a Legged Robot Using Adaptive Time Surface |
|
| Zhu, Shifan | University of Massachusetts Amherst |
| Tang, Zhipeng | University of Massachusetts Amherst |
| Yang, Michael | University of Massachusetts Amherst |
| Learned-Miller, Erik | University of Massachusetts, Amherst |
| Kim, Donghyun | University of Massachusetts Amherst |
Keywords: SLAM, Legged Robots, Localization
Abstract: Our paper proposes a direct sparse visual odometry method that combines event and RGB-D data to estimate the pose of agile-legged robots during dynamic locomotion and acrobatic behaviors. Event cameras offer high temporal resolution and dynamic range, which can eliminate the issue of blurred RGB images during fast movements. This unique strength holds a potential for accurate pose estimation of agile-legged robots, which has been a challenging problem to tackle. Our framework leverages the benefits of both RGB- D and event cameras to achieve robust and accurate pose estimation, even during dynamic maneuvers such as jumping and landing of a quadruped robot, Mini-Cheetah. Our major contributions are threefold: Firstly, we introduce an adaptive time surface (ATS) method that addresses the whiteout and blackout issue in common time surfaces by formulating pixel-wise decay rates based on scene complexity and motion speed. Secondly, we develop an effective pixel selection method that directly samples from event data and applies sample filtering through ATS, enabling us to pick pixels on distinct features. Lastly, we propose a nonlinear pose optimization formula that simultaneously performs 3D-2D alignment on both RGB-based and event-based maps and images, allowing the algorithm to fully exploit the benefits of both data streams. We extensively evaluate the performance of our framework on both public datasets and our own quadruped robot dataset, demonstrating its effectiveness in accurately estimating the pose of agile robots during dynamic movements.
|
| |
| 15:30-17:00, Paper MoBIP-15.11 | Add to My Program |
| Enhancing Robustness of Line Tracking through Semi-Dense Epipolar Search in Line-Based SLAM |
|
| Seo, Dong-Uk | Korea Advanced Institute of Science and Technology |
| Lim, Hyungtae | Korea Advanced Institute of Science and Technology |
| Lee, Eungchang Mason | Korea Advanced Institute of Science and Technology |
| Lim, Hyunjun | Korea Advanced Institute of Science and Technology |
| Myung, Hyun | KAIST (Korea Advanced Institute of Science and Technology) |
Keywords: Visual-Inertial SLAM, Visual Tracking, SLAM
Abstract: Line information from urban structures can be exploited as an additional geometric feature to achieve robust vision-based simultaneous localization and mapping (SLAM) in textureless scenes. Sometimes, however, conventional line tracking methods fail due to image blur or occlusion. Even though these lost line features are only a subset of the many available features, the failure in feature tracking can potentially lead to performance degradation of the SLAM system, particularly in textureless environments. To tackle this problem, we propose a robust line-tracking method for line-based monocular visual-inertial odometry. The proposed method generates a semi-dense map composed of depth and sparsity mesh using estimated 3D features. By leveraging this semi-dense map, our method performs a range-adaptive epipolar search to match the lines, allowing for robust line tracking while simultaneously reducing false positives. Furthermore, we propose an algorithm to resolve the conflicts that occur when the tracked lines from consecutive matching do not accord with the lines matched by our method. This algorithm discriminately maintains line features while appropriately aggregating lines spread across multiple frames. As evaluated on the EuRoC dataset and a more challenging textureless corridor scene, our proposed method shows substantial performance increases compared with other line-based visual(-inertial) approaches.
|
| |
| 15:30-17:00, Paper MoBIP-15.12 | Add to My Program |
| Stereo Visual Odometry with Deep Learning-Based Point and Line Feature Matching Using an Attention Graph Neural Network |
|
| Kannapiran, Shenbagaraj | Arizona State University |
| Bendapudi, Nalin | Ford Motor Company |
| Yu, Ming-Yuan | University of Michigan |
| Parikh, Devarth | Ford Motor Company |
| Berman, Spring | Arizona State University |
| Vora, Ankit | Ford Motor Company |
| Pandey, Gaurav | Ford Motor Company |
Keywords: SLAM, Localization
Abstract: Robust feature matching forms the backbone for most Visual Simultaneous Localization and Mapping (vSLAM), visual odometry, 3D reconstruction, and Structure from Motion (SfM) algorithms. However, recovering feature matches from texture-poor scenes is a major challenge and still remains an open area of research. In this paper, we present a Stereo Visual Odometry (SVO) technique based on point and line features which uses a novel feature-matching mechanism based on an Attention Graph Neural Network that is designed to perform well even under adverse weather conditions such as fog, haze, rain, and snow, and dynamic lighting conditions such as nighttime illumination and glare scenarios. We perform experiments on multiple real and synthetic datasets to validate our method's ability to perform SVO under low-visibility weather and lighting conditions through robust point and line matches. The results demonstrate that our method achieves more line feature matches than state-of-the-art line-matching algorithms, which when complemented with point feature matches perform consistently well in adverse weather and dynamic lighting conditions.
|
| |
| 15:30-17:00, Paper MoBIP-15.13 | Add to My Program |
| SID-SLAM: Semi-Direct Information-Driven RGB-D SLAM |
|
| Fontan, Alejandro | Queensland University of Technology |
| Giubilato, Riccardo | German Aerospace Center (DLR) |
| Oliva Maza, Laura | German Aerospace Center (DLR) |
| Civera, Javier | Universidad De Zaragoza |
| Triebel, Rudolph | German Aerospace Center (DLR) |
Keywords: SLAM, Localization, Data Sets for SLAM
Abstract: This work presents SID-SLAM, a complete SLAM framework for RGB-D cameras. Our main contribution is a semi-direct approach that, for the first time, combines tightly and indistinctly photometric and feature-based image measurements. Additionally, SID-SLAM uses information metrics to reduce the state size with minimal impact on accuracy. Our evaluation on several public datasets shows that we achieve state-of-the-art performance in terms of accuracy, robustness, and computational footprint, running in real time on a CPU. In order to facilitate research on semi-direct SLAM, we record the Minimal Texture dataset, composed of RGB-D sequences that are challenging for current baselines and in which our pipeline excels.
|
| |
| MoBIP-16 Regular session, Hall E |
Add to My Program |
| Clone of 'AI-Enabled Robotics' |
|
| |
| |
| 15:30-17:00, Paper MoBIP-16.1 | Add to My Program |
| The Design, Education and Evolution of a Robotic Baby (I) |
|
| Zhu, Hanqing | Georgia Institute of Technology |
| Wilson, Sean | Georgia Institute of Technology, Georgia Tech Research Institute |
| Feron, Eric | Georgia Institute of Technology |
Keywords: Learning and Adaptive Systems, AI-Based Methods, Control Architectures and Programming, Natural Language Acquisition and Programming
Abstract: Inspired by Alan Turing's idea of a child machine, we introduce the formal definition of a robotic baby, an integrated system with minimal world knowledge at birth, capable of learning incrementally and interactively, and adapting to the world. Within the definition, fundamental capabilities and system characteristics of the robotic baby are identified and presented as the system-level requirements. As a minimal viable prototype, the Baby architecture is proposed with a systems engineering design approach to satisfy the system-level requirements, which has been verified and validated with simulations and experiments on a robotic system. We demonstrate the capabilities of the robotic baby in natural language acquisition and semantic parsing in English and Chinese, as well as in natural language grounding, natural language reinforcement learning, natural language programming and system introspection for explainability. The education and evolution of the robotic baby are illustrated with real world robotic demonstrations. Inspired by the genetic inheritance in human beings, knowledge inheritance in robotic babies and its benefits regarding evolution are discussed.
|
| |
| 15:30-17:00, Paper MoBIP-16.2 | Add to My Program |
| Selective Presentation of AI Object Detection Results While Maintaining Human Reliance |
|
| Fukuchi, Yosuke | National Institute of Informatics |
| Yamada, Seiji | National Institute of Informatics |
Keywords: Acceptability and Trust, Intelligent Transportation Systems, AI-Based Methods
Abstract: Transparency in decision-making is an important factor for AI-driven autonomous systems to be trusted and relied on by users. Studies in the field of visual information processing typically attempt to make an AI system's behavior transparent by showing bounding boxes or heatmaps as explanations. However, it has also been found that an excessive amount of explanations sometimes causes information overload and brings negative results. This paper proposes SmartBBox, a method for reducing the number of bounding boxes to show while maintaining human reliance on an AI. It infers whether each bounding box is worth showing by predicting its effect on human reliance. SmartBBox can autonomously learn to decide whether to show bounding boxes from humans' usage data. We implemented and tested SmartBBox in an autonomous driving scenario in which a human continuously decides whether to rely on an autonomous driving system while observing the dynamic results of object detection by the system. The results suggest that SmartBBox can reduce the number of bounding boxes shown by 64.8% on average while keeping human reliance at the same level as when all the bounding boxes are presented.
|
| |
| 15:30-17:00, Paper MoBIP-16.3 | Add to My Program |
| Ego-Noise Reduction of a Mobile Robot Using Noise Spatial Covariance Matrix Learning and Minimum Variance Distortionless Response |
|
| Lagacé, Pierre-Olivier | Université De Sherbrooke |
| Ferland, François | Université De Sherbrooke |
| Grondin, Francois | Université De Sherbrooke |
Keywords: Robot Audition
Abstract: The performance of speech and event recognition systems has significantly improved recently thanks to deep learning methods. However, some of these tasks remain challenging when algorithms are deployed on robots, due to mechanical noise and electrical interference generated by their actuators that were unseen while training the neural networks. Ego-noise reduction as a preprocessing step can therefore help solve this issue when using pre-trained speech and event recognition algorithms on robots. In this paper, we propose a new method to reduce ego-noise using only a microphone array and less than two minutes of noise recordings. Using Principal Component Analysis (PCA), the best covariance matrix candidate is selected from a dictionary created online during calibration and used with the Minimum Variance Distortionless Response (MVDR) beamformer. Results show that the proposed method runs in real time, improves the signal-to-distortion ratio (SDR) by up to 10 dB, decreases the word error rate (WER) by 55% in some cases, and increases the Average Precision (AP) of event detection by up to 0.2.
|
| |
| 15:30-17:00, Paper MoBIP-16.4 | Add to My Program |
| Extracting Dynamic Navigation Goal from Natural Language Dialogue |
|
| Liang, Lanjun | Shanghai Institute of Technology |
| Bian, Ganghui | Yantai University, Yantai, P.R. China |
| Zhao, Huailin | Shanghai Institute of Technology |
| Dong, Yanzhi | Yantai University |
| Liu, Huaping | Tsinghua University |
Keywords: AI-Enabled Robotics, Natural Dialog for HRI, Human-Robot Collaboration
Abstract: Effective access to relevant environmental changes in large human environments is critical for service robots performing tasks. Since the position of a dynamic goal such as a person is variable, it is difficult for the robot to locate that person accurately. Notably, humans obtain such information through social software while dealing with daily affairs, whereas current robots search for targets without considering these implicit information changes, which can leave the target objects unfound. Therefore, we propose to extract implicit human location-change information from group-chat dialogues, i.e., watching dialogues in group chats and extracting who, when, and where (3W), to assist robots in finding explicit person targets. We then propose a dynamic spatio-temporal map (DSTM) to store the change information as knowledge for the robot. When the robot is tasked with finding a target person, it follows the changing information in the scene to infer the person's possible locations and their probabilities, and then develops a search strategy. We deployed our framework on a custom mobile robot and performed instruction navigation tasks in a university building to evaluate our approach. We demonstrate the ability of our framework to collect and use information in a large human social environment.
|
| |
| 15:30-17:00, Paper MoBIP-16.5 | Add to My Program |
| TidyBot: Personalized Robot Assistance with Large Language Models |
|
| Wu, Jimmy | Princeton University |
| Antonova, Rika | Stanford University |
| Kan, Adam | The Nueva School |
| Lepert, Marion | Stanford University |
| Zeng, Andy | Google DeepMind |
| Song, Shuran | Columbia University |
| Bohg, Jeannette | Stanford University |
| Rusinkiewicz, Szymon | Princeton University |
| Funkhouser, Thomas A. | Princeton University |
Keywords: Service Robotics, Mobile Manipulation, AI-Enabled Robotics
Abstract: For a robot to personalize physical assistance effectively, it must learn user preferences that can be generally reapplied to future scenarios. In this work, we investigate personalization of household cleanup with robots that can tidy up rooms by picking up objects and putting them away. A key challenge is determining the proper place to put each object, as people's preferences can vary greatly depending on personal taste or cultural background. For instance, one person may prefer storing shirts in the drawer, while another may prefer them on the shelf. We aim to build systems that can learn such preferences from just a handful of examples via prior interactions with a particular person. We show that robots can combine language-based planning and perception with the few-shot summarization capabilities of large language models (LLMs) to infer generalized user preferences that are broadly applicable to future interactions. This approach enables fast adaptation and achieves 91.2% accuracy on unseen objects in our benchmark dataset. We also demonstrate our approach on a real-world mobile manipulator called TidyBot, which successfully puts away 85.0% of objects in real-world test scenarios.
|
| |
| 15:30-17:00, Paper MoBIP-16.6 | Add to My Program |
| L3MVN: Leveraging Large Language Models for Visual Target Navigation |
|
| Yu, Bangguo | University of Groningen |
| Kasaei, Hamidreza | University of Groningen |
| Cao, Ming | University of Groningen |
Keywords: Vision-Based Navigation, AI-Enabled Robotics, Service Robotics
Abstract: Visual target navigation in unknown environments is a crucial problem in robotics. Despite extensive investigation of classical and learning-based approaches in the past, robots lack common-sense knowledge about household objects and layouts. Prior state-of-the-art approaches to this task rely on learning the priors during the training and typically require significant expensive resources and time for learning. To address this, we propose a new framework for visual target navigation that leverages Large Language Models (LLM) to impart common sense for object searching. Specifically, we introduce two paradigms: (i) zero-shot and (ii) feed-forward approaches that use language to find the relevant frontier from the semantic map as a long-term goal and explore the environment efficiently. Our analyses demonstrate the notable zero-shot generalization and transfer capabilities from the use of language. Experiments on Gibson and Habitat-Matterport 3D (HM3D) demonstrate that the proposed framework significantly outperforms existing map-based methods in terms of success rate and generalization. Ablation analysis also indicates that the common-sense knowledge from the language model leads to more efficient semantic exploration. Finally, we provide a real robot experiment to verify the applicability of our framework in real-world scenarios. The supplementary video and code can be accessed via the following link: https://sites.google.com/view/l3mvn.
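The frontier-selection idea can be sketched with a hand-coded relatedness table standing in for the language model's relevance score; all names and scores below are illustrative assumptions:

```python
def score_frontier(frontier_objects, target, relatedness):
    """Score a frontier by how related its nearby objects are to the target.

    `relatedness[(a, b)]` stands in for a language-model relevance score.
    """
    if not frontier_objects:
        return 0.0
    return max(relatedness.get((obj, target), 0.0) for obj in frontier_objects)

def pick_frontier(frontiers, target, relatedness):
    """Choose the highest-scoring frontier as the long-term navigation goal."""
    return max(frontiers, key=lambda f: score_frontier(frontiers[f], target, relatedness))
```

Replacing the table with LLM-derived scores gives the zero-shot variant; training a small network on such scores would correspond to the feed-forward variant.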
|
| |
| 15:30-17:00, Paper MoBIP-16.7 | Add to My Program |
| TopSpark: A Timestep Optimization Methodology for Energy-Efficient Spiking Neural Networks on Autonomous Mobile Agents |
|
| Putra, Rachmad Vidya Wicaksana | Technische Universität Wien (TU Wien) |
| Shafique, Muhammad | New York University Abu Dhabi |
Keywords: AI-Enabled Robotics, Engineering for Robotic Systems, Autonomous Agents
Abstract: Autonomous mobile agents (e.g., mobile ground robots and UAVs) typically require low-power/energy-efficient machine learning (ML) algorithms to complete their ML-based tasks (e.g., object recognition) while adapting to diverse environments, as mobile agents are usually powered by batteries. These requirements can be fulfilled by Spiking Neural Networks (SNNs), as they offer low-power/energy processing due to their sparse computations and efficient online learning with bio-inspired learning mechanisms for adapting to different environments. Recent works have shown that the energy consumption of SNNs can be optimized by reducing the computation time of each neuron for processing a sequence of spikes (i.e., the timestep). However, state-of-the-art techniques rely on intensive design searches to determine fixed timestep settings for the inference phase only, thereby hindering SNN systems from achieving further energy-efficiency gains in both the training and inference phases. These techniques also prevent SNN systems from performing efficient online learning at run time. To address this, we propose TopSpark, a novel methodology that leverages adaptive timestep reduction to enable energy-efficient SNN processing in both the training and inference phases, while keeping accuracy close to that of SNNs without timestep reduction. The key ideas of TopSpark include: (1) analyzing the impact of different timestep settings on accuracy; (2) identifying neuron parameters that have a significant impact on accuracy across different timesteps; (3) employing parameter enhancements that let SNNs effectively perform learning and inference with less spiking activity due to reduced timesteps; and (4) developing a strategy to trade off accuracy, latency, and energy to meet the design requirements. The experimental results show that TopSpark reduces SNN latency by 3.9x as well as energy consumption by 3.5x for training and 3.3x for inference on average, across different network sizes, learning rules, and workloads, while maintaining accuracy within 2% of that of SNNs without timestep reduction. In this manner, TopSpark enables low-power/energy-efficient SNN processing for autonomous mobile agents.
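The accuracy/latency/energy trade-off in step (4) can be sketched as a simple selection rule; the candidate table and tolerance below are illustrative assumptions, not TopSpark's actual search procedure:

```python
def select_timestep(candidates, baseline_acc, tol=0.02):
    """Pick the lowest-energy timestep whose accuracy stays within `tol`
    of the no-reduction baseline.

    `candidates` maps timestep -> (accuracy, energy); a stand-in for a
    design-space exploration over timestep settings.
    """
    feasible = [t for t, (acc, _) in candidates.items()
                if baseline_acc - acc <= tol]
    if not feasible:
        return max(candidates)  # fall back to the largest timestep
    # among feasible timesteps, choose the one with the lowest energy
    return min(feasible, key=lambda t: candidates[t][1])
```

The same rule could be re-evaluated online as accuracy estimates change, which is where the adaptive (rather than fixed) setting matters.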
|
| |
| 15:30-17:00, Paper MoBIP-16.8 | Add to My Program |
| Generating Executable Action Plans with Environmentally-Aware Language Models |
|
| Gramopadhye, Maitrey | University of North Carolina at Chapel Hill |
| Szafir, Daniel J. | University of North Carolina at Chapel Hill |
Keywords: AI-Enabled Robotics, Deep Learning Methods, Task Planning
Abstract: Large Language Models (LLMs) trained using massive text datasets have recently shown promise in generating action plans for robotic agents from high level text queries. However, these models typically do not consider the robot's environment, resulting in generated plans that may not actually be executable, due to ambiguities in the planned actions or environmental constraints. In this paper, we propose an approach to generate environmentally-aware action plans that agents are better able to execute. Our approach involves integrating environmental objects and object relations as additional inputs into LLM action plan generation to provide the system with an awareness of its surroundings, resulting in plans where each generated action is mapped to objects present in the scene. We also design a novel scoring function that, along with generating the action steps and associating them with objects, helps the system disambiguate among object instances and take into account their states. We evaluated our approach using the VirtualHome simulator and the ActivityPrograms knowledge base and found that action plans generated from our system had a 310% improvement in executability and a 147% improvement in correctness over prior work. The complete code and a demo of our method is publicly available at https://github.com/hri-ironlab/scene_aware_language_planner.
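The object-grounding and disambiguation step can be sketched as a toy scoring rule; the scoring terms below are illustrative stand-ins for the paper's learned scoring function:

```python
def ground_action(verb, obj_name, scene):
    """Map a planned (verb, object) pair onto a concrete scene instance.

    `scene` is a list of dicts with 'name' and 'state' keys. Prefers exact
    name matches and, for 'open', instances that are currently closed, so
    the plan step lands on an object in an actionable state.
    """
    def score(inst):
        s = 1.0 if inst["name"] == obj_name else 0.0
        if verb == "open" and inst.get("state") == "closed":
            s += 0.5
        return s
    return max(scene, key=score)
```

The real system scores candidates jointly with the generated action text; this sketch only shows why instance state has to enter the score at all.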
|
| |
| 15:30-17:00, Paper MoBIP-16.9 | Add to My Program |
| Interaction-Aware and Hierarchically-Explainable Heterogeneous Graph-Based Imitation Learning for Autonomous Driving Simulation |
|
| Tabatabaie, Mahan | University of Connecticut |
| He, Suining | University of Connecticut |
| Shin, Kang G. | University of Michigan |
Keywords: Representation Learning, Learning from Demonstration, Imitation Learning
Abstract: Understanding and learning the actor-to-X interactions (AXIs), such as those between the focal vehicles (actor) and other traffic participants (e.g., other vehicles, pedestrians) as well as traffic environments (e.g., city/road map), is essential for development of a decision-making model and simulation of autonomous driving (AD). Existing practices on imitation learning (IL) for AD simulation, despite the advances in the model learnability, have not accounted for fusing and differentiating the heterogeneous AXIs in complex road environments. Furthermore, how to further explain the hierarchical structures within the complex AXIs remains largely under-explored. To overcome these challenges, we propose HGIL, an interaction-aware and hierarchically-explainable Heterogeneous Graph-based Imitation Learning approach for AD simulation. We have designed a novel heterogeneous interaction graph (HIG) to provide local and global representation as well as awareness of the AXIs. Integrating the HIG as the state embeddings, we have designed a hierarchically-explainable generative adversarial imitation learning approach, with local sub-graph and global cross-graph attention, to capture the interaction behaviors and driving decision-making processes. Our data-driven simulation and explanation studies have corroborated the accuracy and explainability of HGIL in learning and capturing the complex AXIs.
|
| |
| 15:30-17:00, Paper MoBIP-16.10 | Add to My Program |
| Zero-Shot Fault Detection for Manipulators through Bayesian Inverse Reinforcement Learning |
|
| Zhao, Hanqing | McGill University |
| Liu, Xue | McGill University |
| Dudek, Gregory | McGill University |
Keywords: Failure Detection and Recovery, Learning from Experience, Robust/Adaptive Control
Abstract: We consider the detection of faults in robotic manipulators, with particular emphasis on faults that have not been observed or identified in advance, which naturally includes those that occur very infrequently. Recent studies indicate that the reward function obtained through Inverse Reinforcement Learning (IRL) can help detect anomalies caused by faults in a control system (i.e. fault detection). Current IRL methods for fault detection, however, either use a linear reward representation or require extensive sampling from the environment to estimate the policy, rendering them inappropriate for safety-critical situations where sampling of failure observations via fault injection can be expensive and dangerous. To address this issue, this paper proposes a zero-shot and exogenous fault detector based on an approximate variational reward imitation learning (AVRIL) structure. The fault detector recovers a reward signal as a function of externally observable information to describe the normal operation, which can then be used to detect anomalies caused by faults. Our method incorporates expert knowledge through a customizable reward prior distribution, allowing the fault detector to learn the reward solely from normal operation samples, without the need for a simulator or costly interactions with the environment. We evaluate our approach for exogenous partial fault detection in multi-stage robotic manipulator tasks, comparing it with several baseline methods. The results demonstrate that our method more effectively identifies unseen faults even when they occur within just three controller time steps.
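The detection rule itself — flag a fault when the recovered reward falls below a threshold fitted on normal operation — can be sketched as follows; the mean-minus-k-sigma threshold is a common heuristic assumed here for illustration, not AVRIL's variational machinery:

```python
def fit_threshold(rewards, k=3.0):
    """Set an anomaly threshold from rewards observed during normal operation:
    mean minus k standard deviations (an assumed heuristic; the paper instead
    recovers a reward posterior with a customizable prior)."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    return mean - k * var ** 0.5

def is_fault(reward, threshold):
    """Flag a fault when the recovered reward drops below the threshold."""
    return reward < threshold
```

Because the threshold is fitted on normal samples only, no fault injection or environment interaction is needed, which is the zero-shot property the abstract emphasizes.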
|
| |
| 15:30-17:00, Paper MoBIP-16.11 | Add to My Program |
| Chat with the Environment: Interactive Multimodal Perception Using Large Language Models |
|
| Zhao, Xufeng | Universität Hamburg |
| Li, Mengdi | University of Hamburg |
| Weber, Cornelius | Knowledge Technology Group, University of Hamburg |
| Hafez, Muhammad Burhan | University of Hamburg |
| Wermter, Stefan | University of Hamburg |
Keywords: AI-Enabled Robotics, Multi-Modal Perception for HRI, AI-Based Methods
Abstract: Programming robot behavior in a complex world faces challenges on multiple levels, from dextrous low-level skills to high-level planning and reasoning. Recent pre-trained Large Language Models (LLMs) have shown remarkable reasoning ability in few-shot robotic planning. However, it remains challenging to ground LLMs in multimodal sensory input and continuous action output, while enabling a robot to interact with its environment and acquire novel information as its policies unfold. We develop a robot interaction scenario with a partially observable state, which necessitates a robot to decide on a range of epistemic actions in order to sample sensory information among multiple modalities, before being able to execute the task correctly. An interactive perception framework is therefore proposed with an LLM as its backbone, whose ability is exploited to instruct epistemic actions and to reason over the resulting multimodal sensations (vision, sound, haptics, proprioception), as well as to plan an entire task execution based on the interactively acquired information. Our study demonstrates that LLMs can provide high-level planning and reasoning skills and control interactive robot behavior in a multimodal environment, while multimodal modules with the context of the environmental state help ground the LLMs and extend their processing ability. The project website can be found at https://matcha-model.github.io.
|
| |
| 15:30-17:00, Paper MoBIP-16.12 | Add to My Program |
| Reinforcement Learning for Robot Navigation with Adaptive Forward Simulation Time (AFST) in a Semi-Markov Model |
|
| Chen, Yu'an | University of Science and Technology of China |
| Ruosong, Ye | University of Science and Technology of China |
| Tao, Ziyang | University of Science and Technology of China |
| Liu, Hongjian | University of Science and Technology of China |
| Chen, Guangda | NetEase |
| Peng, Jie | University of Science and Technology of China |
| Ma, Jun | University of Science and Technology of China |
| Zhang, Yu | University of Science and Technology of China |
| Ji, Jianmin | University of Science and Technology of China |
| Zhang, Yanyong | University of Science and Technology of China |
Keywords: Learning from Experience
Abstract: Deep reinforcement learning (DRL) algorithms have proven effective in robot navigation, especially in unknown environments, by directly mapping perception inputs into robot control commands. However, most existing methods ignore the local minimum problem in navigation and thereby cannot handle complex unknown environments. In this paper, we propose the first DRL-based navigation method modeled by a semi-Markov decision process (SMDP) with continuous action space, named Adaptive Forward Simulation Time (AFST), to overcome this problem. Specifically, we reduce the dimensions of the action space and improve the distributed proximal policy optimization (DPPO) algorithm for the specified SMDP problem by modifying its GAE to better estimate the policy gradient in SMDPs. Experiments in various unknown environments demonstrate the effectiveness of AFST.
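One way GAE can be adapted to an SMDP is to discount each step by gamma raised to its duration; the sketch below illustrates that idea under stated assumptions and is not necessarily AFST's exact modification:

```python
def smdp_gae(rewards, values, durations, gamma=0.99, lam=0.95):
    """Generalized advantage estimation with per-step durations tau_t,
    discounting by gamma**tau_t instead of a fixed gamma.

    `values` has len(rewards) + 1 entries (bootstrap value at the end).
    With all durations equal to 1 this reduces to standard GAE.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    for t in reversed(range(len(rewards))):
        g = gamma ** durations[t]       # duration-dependent discount
        delta = rewards[t] + g * values[t + 1] - values[t]
        gae = delta + g * lam * gae
        advantages[t] = gae
    return advantages
```

Here the "duration" plays the role of the adaptive forward simulation time chosen by the policy, so longer simulated actions are discounted more heavily.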
|
| |
| 15:30-17:00, Paper MoBIP-16.13 | Add to My Program |
| A Hybrid Reinforcement Learning Approach with a Spiking Actor Network for Efficient Robotic Arm Target Reaching |
|
| Oikonomou, Katerina Maria | Democritus University of Thrace |
| Kansizoglou, Ioannis | Democritus University of Thrace |
| Gasteratos, Antonios | Democritus University of Thrace |
Keywords: Bioinspired Robot Learning, Reinforcement Learning, Mobile Manipulation
Abstract: The increasing demand for applications in competitive fields, such as assisted living and aerial robots, drives contemporary research into the development, implementation, and integration of power-constrained solutions. Although deep neural networks (DNNs) have achieved remarkable performance in many robotics applications, energy consumption remains a major limitation. The paper at hand proposes a hybrid variation of the well-established deep deterministic policy gradient (DDPG) reinforcement learning approach to train a 6-degree-of-freedom robotic arm on the target-reach task. In particular, we introduce a spiking neural network (SNN) for the actor model and a DNN for the critic, aiming to find an optimal set of actions for the robot. The deep critic network is employed only during training and discarded afterwards, allowing the deployment of the SNN on neuromorphic hardware for inference. The agent is supported by a combination of RGB and laser scan data exploited for collision avoidance and object detection. We compare the hybrid-DDPG model against a classic DDPG one, demonstrating the superiority of our approach.
|
| |
| 15:30-17:00, Paper MoBIP-16.14 | Add to My Program |
| AR3n: A Reinforcement Learning-Based Assist-As-Needed Controller for Robotic Rehabilitation (I) |
|
| Pareek, Shrey | Cargill |
| Nisar, Harris | University of Illinois at Urbana Champaign |
| Kesavadas, Thenkurussi | University of Illinois at Urbana-Champaign |
Keywords: AI-Enabled Robotics, Rehabilitation Robotics, Reinforcement Learning
Abstract: In this paper, we present AR3n (pronounced as Aaron), an assist-as-needed (AAN) controller that utilizes reinforcement learning to supply adaptive assistance during a robot assisted handwriting rehabilitation task. AR3n uses the soft actor-critic reinforcement learning algorithm to derive a model-free controller for upper limb stroke rehabilitation. Unlike previous AAN controllers, our method does not require manual-tuning of controller parameters or the need for patient specific physical models. We propose the use of a virtual patient model to generalize AR3n across multiple subjects. The system modulates robotic impedance based on a subject's tracking error, while minimizing the amount of robotic assistance. It delivers stable realtime assistance and prevents over-reliance on robotic assistance. The controller is experimentally validated through a set of simulations and human subject experiments. We compare our system to traditional rule-based controllers and a Learning-from-Demonstration controller previously proposed by our group. Finally, we demonstrate the efficacy and superiority of AR3n over rule-based controllers through a human subject study.
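The assist-as-needed principle — assistance that grows with tracking error and vanishes when the subject tracks well — can be sketched as a simple rule; AR3n instead learns this modulation with reinforcement learning, so the gains and deadband below are purely illustrative:

```python
def assistive_gain(tracking_error, k_max=50.0, deadband=0.01, scale=0.1):
    """Map tracking error to a robot impedance gain: zero assistance inside
    the deadband, then a gain that grows with error and saturates at k_max.
    Keeping the gain at the minimum needed discourages over-reliance on
    robotic assistance."""
    err = abs(tracking_error)
    if err <= deadband:
        return 0.0
    return min(k_max, k_max * (err - deadband) / scale)
```

A rule-based controller like this is essentially the baseline the paper compares against; the RL controller replaces the fixed mapping with a learned, subject-adaptive one.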
|
| |
| MoBIP-17 Regular session, Hall E |
Add to My Program |
| Learning from Demonstration |
|
| |
| |
| 15:30-17:00, Paper MoBIP-17.1 | Add to My Program |
| PACT: Perception-Action Causal Transformer for Autoregressive Robotics Pre-Training |
|
| Bonatti, Rogerio | Microsoft |
| Vemprala, Sai | Microsoft Corporation |
| Ma, Shuang | Microsoft |
| Vieira Frujeri, Felipe | Microsoft |
| Chen, Shuhang | Microsoft |
| Kapoor, Ashish | Microsoft |
Keywords: Representation Learning, Learning from Demonstration, Transfer Learning
Abstract: Robotics has long been a field riddled with complex systems architectures whose modules and connections, whether traditional or learning-based, require significant human expertise and prior knowledge. Inspired by large pre-trained language models, this work introduces a paradigm for pre-training a general purpose representation that can serve as a starting point for multiple tasks on a given robot. We present the Perception-Action Causal Transformer (PACT), a generative transformer-based architecture that aims to build representations directly from robot data in a self-supervised fashion. Through autoregressive prediction of states and actions over time, our model implicitly encodes dynamics and behaviors for a particular robot. Our experimental evaluation focuses on the domain of mobile agents, where we show that this robot-specific representation can function as a single starting point to achieve distinct tasks such as safe navigation, localization and mapping. We evaluate two form factors: a wheeled robot that uses a LiDAR sensor as perception input (MuSHR), and a simulated agent that uses first-person RGB images (Habitat). We show that finetuning small task-specific networks on top of the larger pretrained model results in significantly better performance compared to training a single model from scratch for all tasks simultaneously, and comparable performance to training a separate large model for each task independently. By sharing a common good-quality representation across tasks we can lower overall model capacity and speed up the real-time deployment of such systems. Open-sourced code: https://github.com/microsoft/PACT Video: https://youtu.be/mNQvQu_atuw
|
| |
| 15:30-17:00, Paper MoBIP-17.2 | Add to My Program |
| Learning from Sparse Demonstrations (I) |
|
| Jin, Wanxin | Arizona State University |
| Murphey, Todd | Northwestern University |
| Kulic, Dana | Monash University |
| Ezer, Neta | Northrop Grumman Corporation |
| Mou, Shaoshuai | Purdue University |
Keywords: Learning from Demonstration, Optimization and Optimal Control, Motion and Path Planning, Inverse Reinforcement Learning
Abstract: This paper develops the Continuous Pontryagin Differentiable Programming (Continuous PDP) method, which enables a robot to learn an objective function from a small number of sparsely demonstrated keyframes. The keyframes are a few desired sequential outputs that the robot should follow at certain time steps. The time span of the keyframes can be different from that of the robot's actual execution. The method jointly searches for an objective function and a time-warping function such that the robot's resulting motion sequentially follows the keyframes with minimal discrepancy loss. Continuous PDP minimizes the discrepancy loss using projected gradient descent by efficiently solving for the gradient of the robot motion with respect to the unknown parameters. The method is first evaluated on a simulated robot arm and then applied to a 6-DoF maneuvering quadrotor to learn an objective function for motion planning in un-modeled environments. The results show the efficiency of the method, its ability to handle time misalignment between the keyframes and robot execution, and the generalization of objective learning to unseen motion conditions.
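The keyframe discrepancy loss under a time-warping function can be sketched as follows; the linear warp and squared-error loss are illustrative assumptions, and the paper differentiates this loss through the robot dynamics rather than evaluating it on a fixed trajectory:

```python
def keyframe_loss(traj, keyframes, warp):
    """Discrepancy between a robot trajectory and sparse keyframes.

    `traj` is a discretized robot output sequence, `keyframes` is a list of
    (keyframe_time, desired_output) pairs, and `warp` maps keyframe time to
    a trajectory index, absorbing any mismatch between the demonstration's
    time span and the robot's execution time span.
    """
    loss = 0.0
    for t_k, y_k in keyframes:
        idx = min(len(traj) - 1, max(0, round(warp(t_k))))
        loss += (traj[idx] - y_k) ** 2
    return loss
```

Learning then amounts to descending this loss jointly in the objective parameters (which shape `traj`) and the warp parameters.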
|
| |
| 15:30-17:00, Paper MoBIP-17.3 | Add to My Program |
| Neural Field Movement Primitives for Joint Modelling of Scenes and Motions |
|
| Tekden, Ahmet | Chalmers University of Technology |
| Deisenroth, Marc Peter | University College London |
| Bekiroglu, Yasemin | Chalmers University of Technology, University College London |
Keywords: Learning from Demonstration, Representation Learning, Deep Learning in Grasping and Manipulation
Abstract: This paper presents a novel Learning from Demonstration (LfD) method that uses neural fields to learn new skills efficiently and accurately. It achieves this by utilizing a shared embedding to learn both scene and motion representations in a generative way. Our method smoothly maps each expert demonstration to a scene-motion embedding and learns to model them without requiring hand-crafted task parameters or large datasets. It achieves data efficiency by enforcing scene and motion generation to be smooth with respect to changes in the embedding space. At inference time, our method can retrieve scene-motion embeddings using test time optimization, and generate precise motion trajectories for novel scenes. The proposed method is versatile and can employ images, 3D shapes, and any other scene representations that can be modeled using neural fields. Additionally, it can generate both end-effector positions and joint angle-based trajectories. Our method is evaluated on tasks that require accurate motion trajectory generation, where the underlying task parametrization is based on object positions and geometric scene changes. Experimental results demonstrate that the proposed method outperforms the baseline approaches and generalizes to novel scenes. Furthermore, in real-world experiments, we show that our method can successfully model multi-valued trajectories, it is robust to the distractor objects introduced at inference time, and it can generate 6D motions.
|
| |
| 15:30-17:00, Paper MoBIP-17.4 | Add to My Program |
| Augmentation Enables One-Shot Generalization in Learning from Demonstration for Contact-Rich Manipulation |
|
| Li, Xing | TU Berlin |
| Baum, Manuel | TU Berlin |
| Brock, Oliver | Technische Universität Berlin |
Keywords: Learning from Demonstration, Imitation Learning
Abstract: We introduce a Learning from Demonstration (LfD) approach for contact-rich manipulation tasks, i.e., tasks in which the manipulandum's motion is constrained by contact with the environment. Our approach is motivated by the insight that even a large number of demonstrations will often not contain sufficient information to obtain a general policy for the task. To obtain general policies, our approach augments the information contained in a single demonstration. This autonomous augmentation is based on the insight that environmental constraints play a central role in generalization. We validate our approach in real-world experiments with mechanisms with multiple, interdependent articulations, including latch locks, chain locks, and drawers with handles. The extracted policies, obtained from a single augmented human demonstration, generalize to different mechanisms of the same type and in varying environmental settings.
|
| |
| 15:30-17:00, Paper MoBIP-17.5 | Add to My Program |
| Using Single Demonstrations to Define Autonomous Manipulation Contact Tasks in Unstructured Environments Via Object Affordances |
|
| Regal, Frank | The University of Texas at Austin |
| Pettinger, Adam | The University of Texas at Austin |
| Duncan, John Alexander | The University of Texas at Austin |
| Parra, Fabian | University of Texas at Austin |
| Akita, Emmanuel | The University of Texas at Austin |
| Navarro, Alex | University of Texas at Austin |
| Pryor, Mitchell | University of Texas |
Keywords: Learning from Demonstration, Task and Motion Planning, Virtual Reality and Interfaces
Abstract: Performing a manipulation contact task in an unknown and unstructured environment is still a challenge. Learning from Demonstration (LfD) techniques provide an intuitive means to define difficult-to-model contact tasks, but have attributes that make them undesirable for novice users in uncertain environments. We present a novel end-to-end system that captures a single manipulation task demonstration from an augmented reality (AR) head-mounted display (HMD), computes an affordance primitive (AP) representation of the task, and sends the task parameters to a mobile manipulator for execution in real-time. Using an AR HMD for task demonstration and APs for task representation has several distinct advantages. AR task demonstration is intuitive, practical, and can be accomplished without requiring sensor installment in the task environment. APs provide a compact and legible task representation, enabling scalability, generalization, and modification of the task without significant data processing overhead. In this effort, we demonstrate system generalization with 10 object manipulation tasks, confirming the computed parameters from all tasks fit within AP tolerances. Secondly, we evaluate a mobile manipulator robot's ability to perform human-demonstrated tasks using AP representation. To increase robustness, we devised and tested four methods to correct for inherent, irreducible position errors in the system. A final study shows the system has a manipulation success rate of 96% from a single manipulation demonstration on an industrial wheel valve.
|
| |
| 15:30-17:00, Paper MoBIP-17.6 | Add to My Program |
| Constrained Dynamic Movement Primitives for Collision Avoidance in Novel Environments |
|
| Shaw, Seiji | Massachusetts Institute of Technology |
| Jha, Devesh | Mitsubishi Electric Research Laboratories |
| Raghunathan, Arvind | Mitsubishi Electric Research Laboratories |
| Corcodel, Radu Ioan | Mitsubishi Electric Research Laboratories |
| Romeres, Diego | Mitsubishi Electric Research Laboratories |
| Konidaris, George | Brown University |
| Nikovski, Daniel | MERL |
Keywords: Learning from Demonstration, Robot Safety, Collision Avoidance
Abstract: Dynamic movement primitives are widely used for learning skills that can be demonstrated to a robot by a skilled human or controller. While their generalization capabilities and simple formulation make them very appealing to use, they possess no strong guarantees to satisfy operational safety constraints for a task. We present constrained dynamic movement primitives (CDMPs), which can allow for positional constraint satisfaction in the robot workspace. Our method solves a non-linear optimization to perturb an existing DMP's forcing weights to admit a Zeroing Barrier Function (ZBF), which certifies positional workspace constraint satisfaction. We demonstrate our approach under different positional constraints on the end-effector movement on multiple physical robots, such as obstacle avoidance and workspace limitations.
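The zeroing-barrier condition that CDMPs certify can be checked along a discretized trajectory with finite differences; this sketch only verifies the condition h_dot >= -alpha * h, whereas the paper enforces it by optimizing the DMP forcing weights:

```python
def zbf_satisfied(trajectory, h, alpha=1.0, dt=0.01):
    """Check a zeroing-barrier-style condition along a discretized trajectory.

    `h` maps a state to a safety margin (h >= 0 means safe, e.g. distance to
    an obstacle minus its radius). The trajectory satisfies the condition if
    it stays in the safe set and h never decays faster than -alpha * h.
    """
    for x0, x1 in zip(trajectory, trajectory[1:]):
        h0, h1 = h(x0), h(x1)
        if h0 < 0 or (h1 - h0) / dt < -alpha * h0 - 1e-9:
            return False
    return h(trajectory[-1]) >= 0
```

In the paper's setting, a checker like this would reject the unconstrained DMP rollout, and the optimization perturbs the weights until the condition holds.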
|
| |
| 15:30-17:00, Paper MoBIP-17.7 | Add to My Program |
| Learning Constraints on Autonomous Behavior from Proactive Feedback |
|
| Basich, Connor | University of Massachusetts Amherst |
| Mahmud, Saaduddin | University of Massachusetts Amherst |
| Zilberstein, Shlomo | University of Massachusetts |
Keywords: Learning from Demonstration, AI-Based Methods, Reinforcement Learning
Abstract: Learning from feedback is a common paradigm to acquire information that is hard to specify a priori. In this work, we consider an agent with a known nominal reward model that captures its high-level task objective. Furthermore, the agent operates subject to constraints that are unknown a priori and must be inferred from human interventions. Unlike existing methods, our approach does not rely on full or partial demonstration trajectories or assume a fully reactive human. Instead, we assume access only to sparse interventions, which may in fact be generated proactively by the human, and we only make minimal assumptions about the human. We provide both theoretical bounds on performance and empirical validations of our method. We show that our method enables an agent to learn a constraint set with high accuracy that generalizes well to new environments within a domain, whereas methods that only consider reactive feedback learn an incorrect constraint set that does not generalize well, making constraint violations more likely in new environments.
|
| |
| 15:30-17:00, Paper MoBIP-17.8 | Add to My Program |
| Learning Models of Adversarial Agent Behavior under Partial Observability |
|
| Ye, Sean | Georgia Institute of Technology |
| Natarajan, Manisha | Georgia Institute of Technology |
| Wu, Zixuan | Georgia Institute of Technology |
| Paleja, Rohan | Georgia Institute of Technology |
| Chen, Letian | Georgia Institute of Technology |
| Gombolay, Matthew | Georgia Institute of Technology |
Keywords: Learning from Demonstration, Deep Learning Methods, Representation Learning
Abstract: The need for opponent modeling and tracking arises in several real-world scenarios, such as professional sports, video game design, and drug-trafficking interdiction. In this work, we present GRaph based Adversarial Modeling with Mutual Information (GrAMMI) for modeling the behavior of an adversarial opponent agent. GrAMMI is a novel graph neural network (GNN) based approach that uses mutual information maximization as an auxiliary objective to predict the current and future states of an adversarial opponent with partial observability. To evaluate GrAMMI, we design two large-scale, pursuit-evasion domains inspired by real-world scenarios, where a team of heterogeneous agents is tasked with tracking and interdicting a single adversarial agent, and the adversarial agent must evade detection while achieving its own objectives. With the mutual information formulation, GrAMMI outperforms all baselines in both domains and achieves 31.68% higher log-likelihood on average for future adversarial state predictions across both domains.
|
| |
| 15:30-17:00, Paper MoBIP-17.9 | Add to My Program |
| Robust Real-Time Motion Retargeting Via Neural Latent Prediction |
|
| Wang, Tiantian | Zhejiang University |
| Zhang, Haodong | Zhejiang University |
| Chen, Lu | Zhejiang University |
| Wang, Dongqi | Zhejiang University |
| Wang, Yue | Zhejiang University |
| Xiong, Rong | Zhejiang University |
Keywords: Learning from Demonstration, Imitation Learning, Dual Arm Manipulation
Abstract: Human-robot motion retargeting is a crucial approach for fast learning of motion skills. Achieving real-time retargeting demands high levels of synchronization and accuracy. Even though existing retargeting methods compute quickly, they still introduce a time-delay effect in synchronous retargeting. To mitigate this issue, this paper proposes a motion retargeting method guided by prediction, which effectively reduces the adverse impact of time delay. The proposed pipeline contains motion retargeting in a spatio-temporal graph-based structure and motion prediction in the latent space. The motion-sequence retargeting builds a mapping and paired data from human poses to corresponding robot configurations for training the prediction model, and the generated robot motion satisfies joint-limit and self-collision constraints. The prediction-guided controller uses predicted future robot joint motion for look-ahead trajectory tracking, thereby compensating for the delay spent on calculation and tracking. Experimental results show that our method outperforms other methods in terms of synchronization and similarity. Furthermore, our method exhibits fault-tolerant capability in scenarios involving the loss of human information input.
|
| |
| 15:30-17:00, Paper MoBIP-17.10 | Add to My Program |
| Deep Probabilistic Movement Primitives with a Bayesian Aggregator |
|
| Przystupa, Michael | University of Alberta |
| Haghverd, Faezeh | University of Alberta |
| Jagersand, Martin | University of Alberta |
| Tosatto, Samuele | University of Innsbruck |
Keywords: Learning from Demonstration, Imitation Learning, Probabilistic Inference
Abstract: Movement primitives are trainable parametric models that reproduce robotic movements starting from a limited set of demonstrations. Previous works proposed simple linear models that exhibited high sample efficiency and generalization power by allowing temporal modulation of movements (reproducing movements faster or slower), blending (merging two movements into one), via-point conditioning (constraining a movement to meet particular via-points) and context conditioning (generation of movements based on an observed variable, e.g., the position of an object). Previous works have also proposed neural-network-based motor primitive models, demonstrating their capacity to perform tasks with some forms of input conditioning or time-modulation representations. However, no single unified deep motor primitive model has been proposed that is capable of all the previous operations, limiting the potential applications of neural motor primitives. This paper proposes a deep movement primitive architecture that encodes all the operations above and uses a Bayesian context aggregator that allows more sound context conditioning and blending. Our results demonstrate that our approach can scale to reproduce complex motions on a larger variety of input choices than baselines while maintaining the operations that linear movement primitives provide.
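A Bayesian context aggregator is often realized as Gaussian conditioning: each observed context contributes a factorized Gaussian factor over the latent variable, and factors are fused by precision weighting, so more (or more confident) observations tighten the posterior. A minimal sketch under that assumption (the function and parameter names are hypothetical):

```python
import numpy as np

def bayesian_aggregate(mu_prior, var_prior, obs_mu, obs_var):
    """Gaussian Bayesian aggregation of per-observation latent estimates.

    Each context observation i contributes a factorized Gaussian
    N(obs_mu[i], obs_var[i]) over the latent variable; the posterior is the
    product of the prior and all factors (precision-weighted mean).
    mu_prior, var_prior: (D,); obs_mu, obs_var: (N, D).
    """
    prec = 1.0 / var_prior + np.sum(1.0 / obs_var, axis=0)  # posterior precision
    var_post = 1.0 / prec
    mu_post = var_post * (mu_prior / var_prior + np.sum(obs_mu / obs_var, axis=0))
    return mu_post, var_post
```

A highly confident observation (tiny `obs_var`) pins the posterior mean to it, while adding observations monotonically shrinks the posterior variance; this is what allows sound blending of multiple contexts.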
|
| |
| 15:30-17:00, Paper MoBIP-17.11 | Add to My Program |
| Self-Supervised Visual Motor Skills Via Neural Radiance Fields |
|
| Gesel, Paul | University of New Hampshire |
| Sojib, Noushad | University of New Hampshire |
| Begum, Momotaz | University of New Hampshire |
Keywords: Learning from Demonstration, Imitation Learning, Deep Learning in Grasping and Manipulation
Abstract: In this paper, we propose a novel network architecture for visual imitation learning that exploits neural radiance fields (NeRFs) and key-point correspondence for self-supervised visual motor policy learning. The proposed network architecture incorporates a dynamic system output layer for policy learning. Combining the stability and goal-adaptation properties of dynamic systems with the robustness of keypoint-based correspondence yields a policy that is invariant to significant clutter, occlusions, changes in lighting conditions, and spatial variations in goal configurations. Experiments on multiple manipulation tasks show that our method outperforms comparable visual motor policy learning methods in both in-distribution and out-of-distribution scenarios when using a small number of training samples.
|
| |
| 15:30-17:00, Paper MoBIP-17.12 | Add to My Program |
| Autonomous Ultrasound Scanning towards Standard Plane Using Interval Interaction Probabilistic Movement Primitives |
|
| Hu, Yi | University of Alberta |
| Tavakoli, Mahdi | University of Alberta |
Keywords: Learning from Demonstration, Imitation Learning, Surgical Robotics: Planning
Abstract: Learning from demonstrations is the paradigm in which robots acquire new skills demonstrated by an expert, alleviating the physical burden on experts performing repetitive tasks. Ultrasound scanning is one way to view the anatomical structures of soft tissues, but some tissue scanning tasks are repetitive. In this study, a framework for autonomous ultrasound scanning towards a standard plane is proposed. Interaction probabilistic movement primitives (iProMP) were previously proposed for collaborative tasks involving human and robot movement. Inspired by the interval type-2 fuzzy system, an interval iProMP is proposed to learn the ultrasound scanning navigation strategy from scanning demonstrations, with robot movement and ultrasound image information as the collaborative agents. The proposed interval iProMP improves the capacity to deal with uncertainties due to insufficient observations during reproduction. U-Net is applied to recognize the desired ultrasound image shown during demonstrations, and a confidence map is used to evaluate ultrasound image quality. Breast seroma scanning is chosen as the task to validate the performance of the proposed autonomous ultrasound scanning framework, with ultrasound navigation realizing autonomous scanning for localizing the breast seroma. Simulation comparisons show that the proposed interval iProMP performs better under insufficient observations than the traditional iProMP. Experimental results validate the feasibility and generality of the proposed framework, with interval iProMP achieving a higher success rate than traditional iProMP.
|
| |
| 15:30-17:00, Paper MoBIP-17.13 | Add to My Program |
| Learning Continuous Grasping Function with a Dexterous Hand from Human Demonstrations |
|
| Ye, Jianglong | UC San Diego |
| Wang, Jiashun | Carnegie Mellon University |
| Huang, Binghao | University of California, San Diego |
| Qin, Yuzhe | UC San Diego |
| Wang, Xiaolong | UC San Diego |
Keywords: Learning from Demonstration, Dexterous Manipulation, Deep Learning in Grasping and Manipulation
Abstract: We propose to learn to generate grasping motion for manipulation with a dexterous hand using implicit functions. With continuous time inputs, the model can generate a continuous and smooth grasping plan. We name the proposed model Continuous Grasping Function (CGF). CGF is learned via generative modeling with a Conditional Variational Autoencoder using 3D human demonstrations. We first convert the large-scale human-object interaction trajectories to robot demonstrations via motion retargeting, and then use these demonstrations to train CGF. During inference, we perform sampling with CGF to generate different grasping plans in the simulator and select the successful ones to transfer to the real robot. By training on diverse human data, our CGF allows generalization to manipulate multiple objects. Compared to previous planning algorithms, CGF is more efficient and achieves a significant improvement in success rate when transferred to grasping with the real Allegro Hand. Our project page is available at https://jianglongye.com/cgf/ .
|
| |
| 15:30-17:00, Paper MoBIP-17.14 | Add to My Program |
| Robot Programming by Demonstration: Trajectory Learning Enhanced by sEMG-Based User Hand Stiffness Estimation (I) |
|
| Biagiotti, Luigi | University of Modena and Reggio Emilia |
| Meattini, Roberto | University of Bologna |
| Chiaravalli, Davide | Alma Mater Studiorum, University of Bologna |
| Palli, Gianluca | University of Bologna |
| Melchiorri, Claudio | University of Bologna |
Keywords: Learning from Demonstration, Motion and Path Planning, Physical Human-Robot Interaction, Control Architectures and Programming
Abstract: Trajectory learning is one of the key components of robot Programming by Demonstration (PbD) approaches, which in many cases, especially in industrial practice, aim at defining complex manipulation patterns. In order to enhance these methods, which are generally based on a physical interaction in which the user guides the robot along the desired path, an additional input channel is considered in this work. The hand stiffness that the operator continuously modulates during the demonstration is estimated from forearm surface electromyography (sEMG) and translated into a request for a higher or lower accuracy level. Then, a constrained optimization problem is built (and solved) in the framework of smoothing B-splines to obtain a minimum-curvature trajectory approximating the taught path within the precision imposed by the user. Experimental tests in different applicative scenarios, involving both position and orientation, prove the benefits of the proposed approach in terms of the intuitiveness of the programming procedure for the human operator and the characteristics of the final motion.
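The stiffness-weighted smoothing problem can be illustrated with its discrete analogue: a penalized least-squares smoother whose pointwise data weights come from the estimated hand stiffness, so high stiffness pins the result to the taught point and low stiffness lets the curvature penalty smooth it out. This is a simplified 1-D stand-in with second-difference curvature, not the authors' B-spline formulation:

```python
import numpy as np

def stiffness_weighted_smoother(path, stiffness, lam=1.0):
    """Minimum-curvature smoothing of a demonstrated path with pointwise
    accuracy demands set by the user's hand stiffness.

    Solves: min_x  sum_i w_i (x_i - p_i)^2 + lam * sum_i (x_{i-1} - 2 x_i + x_{i+1})^2
    path: (n,) taught positions; stiffness: (n,) weights w_i; lam: smoothness.
    """
    n = len(path)
    D = np.zeros((n - 2, n))                 # discrete second-difference operator
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    W = np.diag(stiffness)
    # Normal equations of the penalized least-squares problem.
    return np.linalg.solve(W + lam * D.T @ D, W @ np.asarray(path, float))
```

With uniform low stiffness the output approaches a straight-line fit; with very high stiffness it reproduces the demonstration almost exactly.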
|
| |
| MoBIP-18 Regular session, Hall E |
Add to My Program |
| Human Detection and Pose |
|
| |
| |
| 15:30-17:00, Paper MoBIP-18.1 | Add to My Program |
| Automated Key Action Detection for Closed Reduction of Pelvic Fractures by Expert Surgeons in Robot-Assisted Surgery |
|
| Pan, Ming Zhang | Guangxi University |
| Deng, Ya-Wen | Guangxi University |
| Li, Zhen | Institute of Automation, Chinese Academy of Sciences |
| Chen, Yuan | Guangxi University |
| Liao, Xiao-Lan | Guangxi University |
| Bian, Gui-Bin | Institute of Automation, Chinese Academy of Sciences |
Keywords: Gesture, Posture and Facial Expressions, Intention Recognition
Abstract: Pelvic fractures are among the most serious traumas in orthopedics, and the technical proficiency and expertise of the surgical team strongly influence the quality of reduction results. With the advancement of information technology and robotics, robot-assisted pelvic fracture reduction surgery is expected to reduce the impact of inexperienced doctors and improve the accuracy and stability of pelvic reduction. However, this requires the robot to detect key surgeon actions from time-series data, enabling it to independently perceive the surgical status, predict the surgeon's intentions, assess the demonstrated level of professional competence, and assess the progress of the surgery. Therefore, a multi-task deep learning architecture is proposed that incorporates a Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) model along with tri-modality fusion and feature extraction techniques. The proposed framework aims to achieve key action detection in closed reduction operations for pelvic fractures. Subsequently, a trimodal fine-grained dataset was constructed, in which 29, 32, and 14 labels were marked on flexion, position, and pressure data for 14 key closed reduction actions. The experimental results show that the correct detection rate of closed reduction actions is 92.3%, significantly higher than that of commonly used recognition algorithms. This work provides a method for the robot to learn the surgeon's professional knowledge, provides the basis for motion perception during the operation, and contributes to the autonomy of robot-assisted closed reduction surgery for pelvic fractures.
|
| |
| 15:30-17:00, Paper MoBIP-18.2 | Add to My Program |
| LAMP: Leveraging Language Prompts for Multi-Person Pose Estimation |
|
| Hu, Shengnan | University of Central Florida |
| Zheng, Ce | University of Central Florida |
| Zhou, Zixiang | University of Central Florida |
| Chen, Chen | University of Central Florida |
| Sukthankar, Gita | University of Central Florida |
Keywords: Gesture, Posture and Facial Expressions, Deep Learning for Visual Perception, Human Detection and Tracking
Abstract: Human-centric visual understanding is an important desideratum for effective human-robot interaction. In order to navigate crowded public places, social robots must be able to interpret the activity of the surrounding humans. This paper addresses one key aspect of human-centric visual understanding, multi-person pose estimation. Achieving good performance on multi-person pose estimation in crowded scenes is difficult due to the challenges of occluded joints and instance separation. In order to tackle these challenges and overcome the limitations of image features in representing invisible body parts, we propose a novel prompt-based pose inference strategy called LAMP, Language Assisted Multi-person Pose estimation. By utilizing the text representations generated by a well-trained language model (CLIP), LAMP can facilitate the understanding of poses on the instance and joint levels, and learn more robust visual representations that are less susceptible to occlusion. This paper demonstrates that language-supervised training boosts the performance of single-stage multi-person pose estimation, and both instance-level and joint-level prompts are valuable for training. The code is available at https://github.com/shengnanh20/LAMP.
|
| |
| 15:30-17:00, Paper MoBIP-18.3 | Add to My Program |
| Detecting Changes in Functional State: A Comparative Analysis Using Wearable Sensors and a Sensorized Tip |
|
| Otamendi, Janire | University of the Basque Country UPV/EHU |
| Zubizarreta, Asier | University of the Basque Country (UPV/EHU) |
Keywords: Medical Robots and Systems
Abstract: Gait analysis can provide relevant information about the physical and neurological conditions of individuals. For this reason, several studies have recently been carried out in an attempt to monitor people's gait and automatically detect gait anomalies. Among the various monitoring systems available for gait analysis, wearable sensors are considered the gold standard due to their wide capture range and low cost. However, in the case of people who require assistive devices for walking, some studies have proposed the use of sensorized devices in order to minimize invasiveness. Nevertheless, there is still a lack of comparative works that evaluate the performance of sensorized assistive walking devices against widely used wearable sensors. Hence, this paper presents a comparison between the performance of accelerometer-based wearable sensors and a sensorized tip developed by the authors to detect gait anomalies. The comparative study has been carried out in a controlled environment with five healthy subjects, in which three different physical states have been simulated. A machine-learning-based anomaly detection approach has been implemented based on the data captured by a set of wearable sensors and the sensorized tip, and the overall performance of both monitoring systems has been evaluated. Results show that even if both devices can provide an average accuracy of more than 80% in gait anomaly detection, the sensorized tip provides better performance.
|
| |
| 15:30-17:00, Paper MoBIP-18.4 | Add to My Program |
| DiffuPose: Monocular 3D Human Pose Estimation Via Denoising Diffusion Probabilistic Model |
|
| Choi, Jeongjun | Seoul National University |
| Shim, Dongseok | Seoul National University |
| Kim, H. Jin | Seoul National University |
Keywords: Human Detection and Tracking, Visual Learning, Deep Learning Methods
Abstract: Thanks to the development of 2D keypoint detectors, monocular 3D human pose estimation (HPE) via 2D-to-3D uplifting approaches has achieved remarkable improvements. Still, monocular 3D HPE is a challenging problem due to inherent depth ambiguities and occlusions. Many previous works exploit temporal information to mitigate these difficulties; however, there are many real-world applications where frame sequences are not accessible. This paper focuses on reconstructing a 3D pose from a single 2D keypoint detection. Rather than exploiting temporal information, we alleviate the depth ambiguity by generating multiple 3D pose candidates that can be mapped to an identical 2D keypoint set. We build a novel diffusion-based framework to effectively sample diverse 3D poses from an off-the-shelf 2D detector. By replacing the conventional denoising U-Net with a graph convolutional network that captures correlations between human joints, our approach accomplishes further performance improvements. We evaluate our method on the widely adopted Human3.6M and HumanEva-I datasets. Comprehensive experiments are conducted to prove the efficacy of the proposed method, and they confirm that our model outperforms state-of-the-art multi-hypothesis 3D HPE methods.
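The reverse (sampling) half of a denoising diffusion model can be sketched in a few lines. Below, `denoise_fn` is a stand-in for a noise predictor such as the paper's graph convolutional denoiser, conditioned on a placeholder `cond` argument (the 2D keypoints in DiffuPose); the linear beta schedule and step count are illustrative assumptions.

```python
import numpy as np

def ddpm_sample(denoise_fn, cond, shape, T=50, seed=0):
    """Standard DDPM ancestral sampling: start from Gaussian noise and
    iteratively denoise using the predicted noise eps."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.normal(size=shape)                       # pure-noise start
    for t in reversed(range(T)):
        eps = denoise_fn(x, t, cond)
        mean = (x - betas[t] / np.sqrt(1 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        x = mean + (np.sqrt(betas[t]) * rng.normal(size=shape) if t > 0 else 0.0)
    return x

# Toy eps-predictor that always steers toward one fixed 3D pose `target`
# (17 joints x 3 coords), exact if the clean sample were `target`:
target = np.ones((17, 3))
def toy_denoiser(x_t, t, cond, T=50):
    ab = np.cumprod(1.0 - np.linspace(1e-4, 0.02, T))[t]
    return (x_t - np.sqrt(ab) * target) / np.sqrt(1 - ab)

pose = ddpm_sample(toy_denoiser, cond=None, shape=(17, 3))
```

Sampling the loop repeatedly with different seeds yields the multiple 3D pose hypotheses the abstract describes; a learned denoiser conditioned on the 2D keypoints replaces `toy_denoiser`.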
|
| |
| 15:30-17:00, Paper MoBIP-18.5 | Add to My Program |
| BodySLAM++: Fast and Tightly-Coupled Visual-Inertial Camera and Human Motion Tracking |
|
| Henning, Dorian Fritz | Imperial College London |
| Choi, Christopher | Imperial College London |
| Schaefer, Simon | Technical University of Munich |
| Leutenegger, Stefan | Technical University of Munich |
Keywords: Human Detection and Tracking, Modeling and Simulating Humans, Visual-Inertial SLAM
Abstract: Robust, fast, and accurate human state -- 6D pose, shape, and posture -- estimation remains a challenging problem. For real-world applications, the ability to estimate the human state in real-time is highly desirable. In this paper, we present BodySLAM++, a fast, efficient, and accurate human and camera state estimation framework relying on visual-inertial data. BodySLAM++ extends an existing visual-inertial state estimation framework, OKVIS2, to solve the dual task of estimating camera and human states simultaneously. Our system improves the accuracy of both human and camera state estimation with respect to baseline methods by 26% and 12%, respectively, and achieves real-time performance at 15+ frames per second on an Intel i7-model CPU. Experiments were conducted on a custom dataset containing both ground truth human and camera poses collected with an indoor motion tracking system.
|
| |
| 15:30-17:00, Paper MoBIP-18.6 | Add to My Program |
| Characterizing the Onset and Offset of Motor Imagery During Passive Arm Movements Induced by an Upper-Body Exoskeleton |
|
| Mitra, Kanishka | The University of Texas at Austin |
| Racz, Frigyes Samuel | The University of Texas at Austin |
| Kumar, Satyam | The University of Texas at Austin |
| Deshpande, Ashish | The University of Texas |
| Millán, José del R. | The University of Texas at Austin |
Keywords: Brain-Machine Interfaces, Rehabilitation Robotics, Prosthetics and Exoskeletons
Abstract: Two distinct technologies have gained attention lately due to their prospects for motor rehabilitation: robotics and brain-machine interfaces (BMIs). Harnessing their combined efforts is a largely uncharted and promising direction with immense clinical potential. However, a significant challenge is whether motor intentions can be accurately detected using non-invasive BMIs in the presence of instrumental noise and passive movements induced by the rehabilitation exoskeleton. As an alternative to the straightforward continuous control approach, this study instead aims to characterize the onset and offset of motor imagery during passive arm movements induced by an upper-body exoskeleton, to allow for the natural control (initiation and termination) of functional movements. Ten participants were recruited to perform kinesthetic motor imagery (MI) of the right arm while attached to the robot, simultaneously cued with LEDs indicating the initiation and termination of a goal-oriented reaching task. Using electroencephalogram signals, we built a decoder to detect the transitions between i) resting and beginning MI and ii) maintaining and ending MI. Offline decoder evaluation achieved group-average accuracies of 60.7% for onset and 66.6% for offset, revealing that the start and stop of MI could be identified while attached to the robot. Furthermore, pseudo-online evaluation replicated this performance, forecasting reliable online exoskeleton control in the future. Our approach showed that participants could produce quality and reliable sensorimotor rhythms regardless of noise or passive arm movements induced by wearing the exoskeleton, which opens new possibilities for BMI control of assistive devices.
|
| |
| 15:30-17:00, Paper MoBIP-18.7 | Add to My Program |
| CLiFF-LHMP: Using Spatial Dynamics Patterns for Long-Term Human Motion Prediction |
|
| Zhu, Yufei | Örebro University |
| Rudenko, Andrey | Robert Bosch GmbH |
| Kucner, Tomasz Piotr | Aalto University |
| Palmieri, Luigi | Robert Bosch GmbH |
| Arras, Kai Oliver | Bosch Research |
| Lilienthal, Achim J. | Örebro University |
| Magnusson, Martin | Örebro University |
Keywords: Human Detection and Tracking
Abstract: Human motion prediction is important for mobile service robots and intelligent vehicles to operate safely and smoothly around people. The more accurate predictions are, particularly over extended periods of time, the better a system can, e.g., assess collision risks and plan ahead. In this paper, we propose to exploit maps of dynamics (MoDs, a class of general representations of place-dependent spatial motion patterns, learned from prior observations) for long-term human motion prediction (LHMP). We present a new MoD-informed human motion prediction approach, named CLiFF-LHMP, which is data efficient, explainable, and insensitive to errors from an upstream tracking system. Our approach uses CLiFF-map, a specific MoD trained with human motion data recorded in the same environment. We bias a constant velocity prediction with samples from the CLiFF-map to generate multi-modal trajectory predictions. In two public datasets we show that this algorithm outperforms the state of the art for predictions over very extended periods of time, achieving 45% more accurate prediction performance at 50s compared to the baseline.
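The biasing step described in the abstract, nudging a constant-velocity rollout toward velocities sampled from the map of dynamics, can be sketched as follows. This is a simplified stand-in: `mod_sample` replaces sampling the actual CLiFF-map (a per-cell mixture over speed and direction), and the parameter names (`beta`, `dt`) are assumptions, not the authors' implementation.

```python
import numpy as np

def cliff_lhmp_rollout(pos, vel, mod_sample, horizon, dt=0.4, beta=0.3):
    """One sampled trajectory of a CLiFF-LHMP-style predictor.

    Starts as a constant-velocity rollout; at every step the current
    velocity is pulled toward a velocity drawn from the map of dynamics
    at the current location. `beta` is the biasing strength.
    """
    traj = [np.asarray(pos, float)]
    v = np.asarray(vel, float)
    for _ in range(horizon):
        v = (1.0 - beta) * v + beta * mod_sample(traj[-1])  # bias toward the MoD
        traj.append(traj[-1] + dt * v)
    return np.stack(traj)
```

Running the rollout many times with stochastic `mod_sample` draws yields the multi-modal trajectory predictions mentioned in the abstract.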
|
| |
| 15:30-17:00, Paper MoBIP-18.8 | Add to My Program |
| GloPro: Globally-Consistent Uncertainty-Aware 3D Human Pose Estimation & Tracking in the Wild |
|
| Schaefer, Simon | Technical University of Munich |
| Henning, Dorian Fritz | Imperial College London |
| Leutenegger, Stefan | Technical University of Munich |
Keywords: Modeling and Simulating Humans, Human and Humanoid Motion Analysis and Synthesis
Abstract: An accurate and uncertainty-aware 3D human body pose estimation is key to enabling truly safe yet efficient human-robot interactions. Current uncertainty-aware methods in 3D human pose estimation are limited to predicting the uncertainty of the body posture, while effectively neglecting the body shape and root pose. In this work, we present GloPro, which is, to the best of our knowledge, the first framework to predict an uncertainty distribution of a 3D body mesh including its shape, pose, and root pose, by efficiently fusing visual cues with a learned motion model. We demonstrate that it vastly outperforms state-of-the-art methods in terms of human trajectory accuracy in a world coordinate system (even in the presence of severe occlusions), yields consistent uncertainty distributions, and can run in real-time.
|
| |
| 15:30-17:00, Paper MoBIP-18.9 | Add to My Program |
| Anytime, Anywhere: Human Arm Pose from Smartwatch Data for Ubiquitous Robot Control and Teleoperation |
|
| Weigend, Fabian Clemens | Arizona State University |
| Sonawani, Shubham | Arizona State University |
| Drolet, Michael | Arizona State University |
| Ben Amor, Heni | Arizona State University |
Keywords: Multi-Modal Perception for HRI, Telerobotics and Teleoperation, Wearable Robotics
Abstract: This work devises an optimized machine learning approach for human arm pose estimation from a single smartwatch. Our approach results in a distribution of possible wrist and elbow positions, which allows for a measure of uncertainty and the detection of multiple possible arm posture solutions, i.e., multimodal pose distributions. Combining estimated arm postures with speech recognition, we turn the smartwatch into a ubiquitous, low-cost and versatile robot control interface. We demonstrate in two use-cases that this intuitive control interface enables users to swiftly intervene in robot behavior, to temporarily adjust their goal, or to train completely new control policies by imitation. Extensive experiments show that the approach results in a 40% reduction in prediction error over the current state-of-the-art and achieves a mean error of 2.56 cm for wrist and elbow positions.
|
| |
| 15:30-17:00, Paper MoBIP-18.10 | Add to My Program |
| Recognizing Real-World Intentions Using a Multimodal Deep Learning Approach with Spatial-Temporal Graph Convolutional Networks |
|
| Shi, Jiaqi | Osaka University, RIKEN |
| Liu, Chaoran | RIKEN |
| Ishi, Carlos Toshinori | RIKEN |
| Wu, Bowen | Osaka University, RIKEN |
| Ishiguro, Hiroshi | Osaka University |
Keywords: Intention Recognition, Deep Learning Methods, AI-Based Methods
Abstract: Identifying intentions is a critical task for comprehending the actions of others, anticipating their future behavior, and making informed decisions. However, recognizing intentions is challenging due to the uncertainty of future human activities and complex influencing factors. In this work, we explore methods for recognizing the intentions underlying human behaviors in the real world, aiming to boost intelligent systems' ability to recognize potential intentions and understand human behaviors. We collect data on real-world human behaviors preceding the use of a hand dispenser and a temperature scanner at a building entrance. These data are processed and labeled into intention categories. A questionnaire is conducted to survey the human ability to infer the intentions of others. Skeleton data and image features are extracted, inspired by the questionnaire responses. For skeleton-based intention recognition, we propose a spatial-temporal graph convolutional network that performs graph convolutions on both part-based graphs and adaptive graphs, achieving the best performance compared with baseline models on the same task. A deep-learning-based method using multimodal features is proposed to automatically infer intentions; in our experiments it accurately predicts intentions from past behaviors, significantly outperforming humans.
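The spatial graph-convolution step underlying such skeleton-based models can be written compactly: each joint's features are mixed with those of its skeletal neighbours through a degree-normalized adjacency before a pointwise nonlinearity. A generic sketch of that step (not the paper's partitioned/adaptive variant):

```python
import numpy as np

def graph_conv(X, A, W):
    """One spatial graph-convolution layer for skeleton data.

    X: (J, C_in) joint features, A: (J, J) skeleton adjacency (self-loops
    are added here), W: (C_in, C_out) learned weights.
    """
    A_hat = A + np.eye(len(A))           # add self-loops
    A_norm = A_hat / A_hat.sum(axis=1)[:, None]  # row-normalize by degree
    return np.maximum(A_norm @ X @ W, 0.0)       # aggregate, project, ReLU
```

A temporal convolution over consecutive frames, stacked with this spatial step, gives the usual spatial-temporal GCN block.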
|
| |
| 15:30-17:00, Paper MoBIP-18.11 | Add to My Program |
| VADER: Vector-Quantized Generative Adversarial Network for Motion Prediction |
|
| Yasar, Mohammad | University of Virginia |
| Iqbal, Tariq | University of Virginia |
Keywords: Intention Recognition, Human Detection and Tracking, Human-Robot Teaming
Abstract: Human motion prediction is an essential component for enabling close-proximity human-robot collaboration. The task of accurately predicting human motion is non-trivial and is compounded by the variability of human motion and the presence of multiple humans in proximity. To address some of the open challenges in motion prediction, in this work, we propose VADER, a novel sequence learning algorithm that models past observed poses using a flexible discrete latent space. VADER introduces the concept of Vector Quantization for human motion prediction, enabling the learning of a discrete latent space without being restricted by any static prior. In addition, we propose a new objective function that uses the discriminator objective to penalize deviation of predicted motion from the ground-truth. Finally, to explicitly model interaction in multiple humans, we introduce a lightweight attention mechanism to condition per-agent prediction on the previous hidden states of all the agents. Our evaluation across three scenarios: single-agent, multi-agent, and human-robot collaboration shows that VADER outperformed all the state-of-the-art approaches, resulting in more feasible human poses that align better with the ground-truth. Finally, we conducted extensive ablation studies to emphasize the importance of the proposed modules.
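The vector-quantization step named in the abstract maps each continuous latent vector to its nearest entry in a learned codebook. A forward-pass sketch; in training, a VQ model would additionally use codebook and commitment losses with a straight-through gradient estimator, which are omitted here:

```python
import numpy as np

def vector_quantize(z, codebook):
    """Nearest-neighbour vector quantization.

    z: (N, D) continuous encoder outputs; codebook: (K, D) code vectors.
    Returns the chosen code indices and the quantized vectors.
    """
    # Squared Euclidean distance from every latent to every code: (N, K).
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)
    return idx, codebook[idx]
```

The resulting discrete latent space is what lets the model represent motion without committing to a fixed continuous prior.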
|
| |
| 15:30-17:00, Paper MoBIP-18.12 | Add to My Program |
| SG-LSTM: Social Group LSTM for Robot Navigation through Dense Crowds |
|
| Bhaskara, Rashmi | Purdue University |
| Chiu, Maurice | Purdue University |
| Bera, Aniket | Purdue University |
Keywords: Human Detection and Tracking, Datasets for Human Motion
Abstract: As personal robots become increasingly accessible and affordable, their applications extend beyond large corporate warehouses and factories to operate in diverse, less controlled environments, where they interact with larger groups of people. In such contexts, ensuring not only safety and efficiency but also mitigating potential adverse psychological impacts on humans and adhering to unwritten social norms become paramount. In this research, we aim to address these challenges by developing a cutting-edge model capable of predicting pedestrian movements and interactions in crowded environments. To this end, we propose a novel approach called the Social Group Long Short-term Memory (SG-LSTM) model, which effectively captures the complexities of human group behavior and interactions within dense surroundings. By integrating social awareness into the LSTM architecture, our model achieves significantly enhanced trajectory predictions. The implementation of our SG-LSTM model empowers navigation algorithms to compute collision-free paths faster and with higher accuracy, particularly in complex and crowded scenarios. To foster further advancements in social navigation research, we contribute a substantial video dataset comprising labeled pedestrian groups, which we release to the broader research community. To thoroughly evaluate the performance of our approach, we conduct extensive experiments on multiple datasets, including ETH, Hotel, and MOT15. We compare various prediction approaches, such as LIN, LSTM, O-LSTM, and S-LSTM, and rigorously assess runtime performance.
|
| |
| 15:30-17:00, Paper MoBIP-18.13 | Add to My Program |
| SmartMocap: Joint Estimation of Human and Camera Motion Using Uncalibrated RGB Cameras |
|
| Saini, Nitin | Max Planck Institute for Intelligent Systems |
| Huang, Chun-Hao Paul | Max Planck Institute for Intelligent Systems, Tübingen |
| Black, Michael | Max Planck Institute for Intelligent Systems in Tübingen |
| Ahmad, Aamir | University of Stuttgart |
Keywords: Gesture, Posture and Facial Expressions, Human Detection and Tracking, Deep Learning for Visual Perception
Abstract: Markerless human motion capture (mocap) from multiple RGB cameras is a widely studied problem. Existing methods either need calibrated cameras or calibrate them relative to a static camera, which acts as the reference frame for the mocap system. The calibration step has to be done a priori for every capture session, which is a tedious process, and re-calibration is required whenever cameras are intentionally or accidentally moved. In this paper, we propose a mocap method which uses multiple static and moving extrinsically uncalibrated RGB cameras. The key components of our method are as follows. First, since the cameras and the subject can move freely, we select the ground plane as a common reference to represent both the body and the camera motions, unlike existing methods, which represent bodies in camera coordinates. Second, we learn a probability distribution of short human motion sequences (~1sec) relative to the ground plane and leverage it to disambiguate between the camera and human motion. Third, we use this distribution as a motion prior in a novel multi-stage optimization approach to fit the SMPL human body model and the camera poses to the human body keypoints on the images. Finally, we show that our method can work on a variety of datasets ranging from aerial cameras to smartphones. It also gives more accurate results compared to the state-of-the-art on the task of monocular human mocap with a static camera. A video demo is available at https://tinyurl.com/yeykrb67 and our code is available at https://tinyurl.com/2p9rme9y .
|
| |
| MoBIP-19 Regular session, Hall E |
Add to My Program |
| Deep Learning Methods II |
|
| |
| |
| 15:30-17:00, Paper MoBIP-19.1 | Add to My Program |
| Online Continual Learning for Robust Indoor Object Recognition |
|
| Michieli, Umberto | Samsung Research |
| Ozay, Mete | Samsung Research |
Keywords: Continual Learning, Incremental Learning, Learning Categories and Concepts
Abstract: Vision systems mounted on home robots need to interact with unseen classes in changing environments. Robots have limited computational resources, labelled data and storage capability. These requirements pose some unique challenges: models should adapt without forgetting past knowledge in a data- and parameter-efficient way. We characterize the problem as few-shot (FS) online continual learning (OCL), where robotic agents learn from a non-repeated stream of few-shot data, updating only a few model parameters. Additionally, such models experience variable conditions at test time, where objects may appear in different poses (e.g., horizontal or vertical) and environments (e.g., day or night). To improve the robustness of CL agents, we propose RobOCLe, which: 1) constructs an enriched feature space by computing high-order statistical moments from the embedded features of samples; and 2) computes similarity between the high-order statistics of samples in the enriched feature space to predict their class labels. We evaluate the robustness of CL models to train/test augmentations in various cases. We show that different moments allow RobOCLe to capture different properties of deformations, providing higher robustness with no decrease in inference speed.
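The enriched feature space described above can be illustrated by concatenating per-dimension central moments of a class's support embeddings and classifying a query by similarity to each class prototype. The moment orders and the use of cosine similarity are assumptions for the sketch, not RobOCLe's exact configuration:

```python
import numpy as np

def moment_embedding(features, orders=(1, 2, 3, 4)):
    """Enriched class prototype from high-order statistical moments.

    features: (N, D) embedded support samples of one class.
    Returns the concatenation of the per-dimension mean and the
    central moments of the requested higher orders.
    """
    mu = features.mean(axis=0)
    parts = [mu]
    for k in orders[1:]:
        parts.append(((features - mu) ** k).mean(axis=0))
    return np.concatenate(parts)

def cosine(a, b):
    """Cosine similarity between two enriched embeddings."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A query is assigned the label of the prototype with the highest similarity; higher-order moments capture how the class embeddings deform under pose or illumination changes.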
|
| |
| 15:30-17:00, Paper MoBIP-19.2 | Add to My Program |
| PaintNet: Unstructured Multi-Path Learning from 3D Point Clouds for Robotic Spray Painting |
|
| Tiboni, Gabriele | Politecnico Di Torino |
| Camoriano, Raffaello | Politecnico Di Torino |
| Tommasi, Tatiana | Politecnico Di Torino |
Keywords: Data Sets for Robot Learning, Deep Learning Methods, Computer Vision for Manufacturing
Abstract: Popular industrial robotic problems such as spray painting and welding require (i) conditioning on free-shape 3D objects and (ii) planning of multiple trajectories to solve the task. Yet, existing solutions make strong assumptions on the form of input surfaces and the nature of output paths, resulting in limited approaches unable to cope with real-data variability. By leveraging on recent advances in 3D deep learning, we introduce a novel framework capable of dealing with arbitrary 3D surfaces, and handling a variable number of unordered output paths (i.e. unstructured). Our approach predicts local path segments, which can be later concatenated to reconstruct long-horizon paths. We extensively validate the proposed method in the context of robotic spray painting by releasing PaintNet, the first public dataset of expert demonstrations on free-shape 3D objects collected in a real industrial scenario. A thorough experimental analysis demonstrates the capabilities of our model to promptly predict smooth output paths that cover up to 95% of previously unseen object surfaces, even without explicitly optimizing for paint coverage.
|
| |
| 15:30-17:00, Paper MoBIP-19.3 | Add to My Program |
| Switching Head-Tail Funnel UNITER for Dual Referring Expression Comprehension with Fetch-And-Carry Tasks |
|
| Korekata, Ryosuke | Keio University |
| Kambara, Motonari | Keio University |
| Yoshida, Yu | Keio University |
| Ishikawa, Shintaro | Keio University |
| Kawasaki, Yosuke | Keio University |
| Takahashi, Masaki | Keio University |
| Sugiura, Komei | Keio University |
Keywords: Deep Learning for Visual Perception, Deep Learning Methods, AI-Enabled Robotics
Abstract: This paper describes a domestic service robot (DSR) that fetches everyday objects and carries them to specified destinations according to free-form natural language instructions. Given an instruction such as "Move the bottle on the left side of the plate to the empty chair," the DSR is expected to identify the bottle and the chair from multiple candidates in the environment and carry the target object to the destination. Most of the existing multimodal language understanding methods are impractical in terms of computational complexity because they require inferences for all combinations of target object candidates and destination candidates. We propose Switching Head-Tail Funnel UNITER, which solves the task by predicting the target object and the destination individually using a single model. Our method is validated on a dataset based on a standard dataset for Vision-and-Language Navigation with object manipulation tasks. The results show that our method outperforms the baseline method in terms of language comprehension accuracy. Furthermore, we conduct physical experiments in which a DSR delivers standardized everyday objects in a standardized domestic environment as requested by instructions with referring expressions. The experimental results show that the object grasping and placing actions are achieved with success rates of more than 90%.
|
| |
| 15:30-17:00, Paper MoBIP-19.4 | Add to My Program |
| FeatDANet: Feature-Level Domain Adaptation Network for Semantic Segmentation |
|
| Li, Jiao | Shanghai Institute of Microsystem and Information Technology |
| Shi, Wenjun | Shanghai Institute of Microsystem and Information Technology |
| Zhu, Dongchen | Shanghai Institute of Microsystem and Information Technology, Chi |
| Zhang, Guanghui | Shanghai Institute of Microsystem and Information Technology, Ch |
| Zhang, Xiaolin | Shanghai Institute of Microsystem and Information Technology, Chi |
| Li, Jiamao | Shanghai Institute of Microsystem and Information Technology, Chi |
Keywords: Transfer Learning, Object Detection, Segmentation and Categorization, Deep Learning Methods
Abstract: Unsupervised domain adaptation (UDA) is proposed to better adapt a network trained on labeled synthetic data to unlabeled real-world data, addressing the annotation cost. However, most of these methods pay more attention to domain distributions in the input and output stages while ignoring important differences in semantic expressions and local details in the middle feature stages. Therefore, a novel UDA network named FeatDANet is presented to align feature-level domain distributions at each encoder layer. Specifically, two attention-based modules, abbreviated as IFAM and DFLM, are designed and implemented by mixing queries and keys between domains for suitable domain adaptation. The former realizes Inter-domain Features Alignment by transferring feature style, and the latter achieves Domain-invariant Features Learning that is robust to the domain shift. Furthermore, FeatDANet is constructed as a self-training network with three weight-sharing branches, and an improved pseudo-label learning strategy is proposed that identifies more confident pseudo-labels and maximizes their use. It increases the participation of unlabeled data and also ensures stability in training. Extensive experiments show that FeatDANet achieves state-of-the-art performance on the GTA-to-Cityscapes and Synthia-to-Cityscapes tasks.
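The query/key "mixing" idea behind modules like IFAM and DFLM can be reduced to plain scaled-dot-product attention where queries come from one domain and keys/values from the other. The sketch below is our schematic illustration, not the paper's module:

```python
import numpy as np

def cross_domain_attention(q_feats, kv_feats):
    """Attention with queries from one domain (e.g., synthetic) and
    keys/values from the other (e.g., real). Each output row is a convex
    combination of the other domain's features. Schematic only."""
    d = q_feats.shape[-1]
    logits = q_feats @ kv_feats.T / np.sqrt(d)
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ kv_feats  # features re-expressed via the other domain
```

Because the output lives in the span of the other domain's features, this kind of exchange transfers feature statistics (style) across domains while preserving spatial correspondence through the attention weights.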
|
| |
| 15:30-17:00, Paper MoBIP-19.5 | Add to My Program |
| BlinkFlow: A Dataset to Push the Limits of Event-Based Optical Flow Estimation |
|
| Li, Yijin | Zhejiang University |
| Huang, Zhaoyang | The Chinese University of Hong Kong |
| Chen, Shuo | Zhejiang University |
| Shi, Xiaoyu | The Chinese University of Hong Kong |
| Li, Hongsheng | Chinese University of Hong Kong |
| Bao, Hujun | Zhejiang University |
| Cui, Zhaopeng | Zhejiang University |
| Zhang, Guofeng | Zhejiang University |
Keywords: Deep Learning for Visual Perception, Deep Learning Methods
Abstract: Event cameras provide high temporal precision, low data rates, and high dynamic range visual perception, which are well-suited for optical flow estimation. While data-driven optical flow estimation has obtained great success in RGB cameras, its generalization performance is seriously hindered in event cameras mainly due to the limited and biased training data. In this paper, we present a novel simulator, BlinkSim, for the fast generation of large-scale data for event-based optical flow. BlinkSim incorporates a configurable rendering engine alongside an event simulation suite. By leveraging the wealth of current 3D assets, the rendering engine enables us to automatically build up thousands of scenes with different objects, textures, and motion patterns and render very high-frequency images for realistic event data simulation. Based on BlinkSim, we construct a large training dataset and evaluation benchmark BlinkFlow that contains sufficient, diversiform, and challenging event data with optical flow ground truth. Experiments show that BlinkFlow improves the generalization performance of state-of-the-art methods by more than 40% on average and up to 90%. Moreover, we further propose an Event-based optical Flow transFormer (E-FlowFormer) architecture. Powered by our BlinkFlow, E-FlowFormer outperforms the SOTA methods by up to 91% on the MVSEC dataset and 14% on the DSEC dataset and presents the best generalization performance. The source code and data are available at https://zju3dv.github.io/blinkflow/.
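The standard idealized event-camera model that simulators of this kind build on can be sketched in a few lines: an event fires at a pixel when its log intensity drifts past a contrast threshold. This is a sketch of the principle with our own function names, not BlinkSim's actual pipeline:

```python
import numpy as np

def frames_to_events(frames, threshold=0.2, eps=1e-6):
    """Convert a sequence of intensity frames into (t, row, col, polarity)
    events: one fires whenever log intensity moves more than `threshold`
    from the last per-pixel reference. Idealized model, illustrative only."""
    log_ref = np.log(frames[0] + eps)
    events = []
    for t, frame in enumerate(frames[1:], start=1):
        log_f = np.log(frame + eps)
        diff = log_f - log_ref
        pos, neg = diff >= threshold, diff <= -threshold
        for polarity, mask in ((1, pos), (-1, neg)):
            for y, x in zip(*np.nonzero(mask)):
                events.append((t, int(y), int(x), polarity))
        log_ref = np.where(pos | neg, log_f, log_ref)  # reset fired pixels
    return events
```

Rendering at a very high frame rate, as the abstract describes, matters precisely because this conversion is only faithful when inter-frame brightness changes are small.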
|
| |
| 15:30-17:00, Paper MoBIP-19.6 | Add to My Program |
| Discovering Symbolic Adaptation Algorithms from Scratch |
|
| Kelly, Stephen | McMaster University |
| Park, Daniel | Google |
| Song, Xingyou | Google Brain |
| McIntire, Mitchell | Google |
| Nashikkar, Pranav | Google |
| Guha, Ritam | Michigan State University |
| Banzhaf, Wolfgang | Michigan State University |
| Deb, Kalyanmoy | Michigan State |
| Boddeti, Vishnu | Michigan State University |
| Tan, Jie | Google |
| Real, Esteban | Google |
Keywords: Evolutionary Robotics, Optimization and Optimal Control, Deep Learning Methods
Abstract: Autonomous robots deployed in the real world will need control policies that rapidly adapt to environmental changes. To this end, we propose AutoRobotics-Zero (ARZ), a method based on AutoML-Zero that discovers zero-shot adaptable policies from scratch. In contrast to neural network adaptation policies, where only model parameters are optimized, ARZ can build control algorithms with the full expressive power of a linear register machine. We evolve modular policies that tune their model parameters and alter their inference algorithm on-the-fly to adapt to sudden environmental changes. We demonstrate our method on a realistic simulated quadruped robot, for which we evolve safe control policies that avoid falling when individual limbs suddenly break. This is a challenging task in which two popular neural network baselines fail. To evolve safe control policies that avoid falling, we leverage multi-objective search to simultaneously optimize forward motion gaits and stability. Finally, we conduct a detailed analysis of our method on a novel and challenging non-stationary control task dubbed Cataclysmic Cartpole. Results confirm our findings that ARZ is significantly more robust to sudden environmental changes and can build simple, interpretable control policies.
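A linear register machine of the kind ARZ evolves is essentially a flat list of arithmetic instructions over a small register file. The toy interpreter below is our illustration of that substrate (the real ARZ instruction set and memory layout are richer):

```python
def run_register_machine(program, inputs, n_regs=8):
    """Execute a linear register machine. A program is a list of
    (op, dst, src_a, src_b) tuples over `n_regs` float registers;
    inputs are loaded into the low registers first. Illustrative only."""
    regs = [0.0] * n_regs
    regs[:len(inputs)] = [float(v) for v in inputs]
    ops = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mul": lambda a, b: a * b,
    }
    for op, dst, a, b in program:
        regs[dst] = ops[op](regs[a], regs[b])
    return regs[0]  # by convention here, register 0 holds the policy output
```

Evolutionary search then mutates the instruction list itself, which is what gives such policies more expressive power than tuning fixed network weights.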
|
| |
| 15:30-17:00, Paper MoBIP-19.7 | Add to My Program |
| Visual Pre-Training for Navigation: What Can We Learn from Noise? |
|
| Wang, Yanwei | MIT |
| Ko, Ching-Yun | MIT |
| Agrawal, Pulkit | MIT |
Keywords: Representation Learning, Vision-Based Navigation, Deep Learning for Visual Perception
Abstract: One powerful paradigm in visual navigation is to predict actions from observations directly. Training such an end-to-end system allows representations useful for downstream tasks to emerge automatically. However, the lack of inductive bias makes this system data inefficient. We hypothesize that a sufficient representation of the current view and the goal view for a navigation policy can be learned by predicting the location and size of a crop of the current view that corresponds to the goal. We further show that training such random crop prediction in a self-supervised fashion purely on synthetic noise images transfers well to natural home images. The learned representation can then be bootstrapped to learn a navigation policy efficiently with little interaction data.
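Generating training data for this pretext task needs no labels at all: sample a noise image, cut a random crop as the stand-in "goal view", and regress the crop's normalized center and size. The generator below is our sketch of such a sample (names and parameterization are assumptions, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_crop_prediction_sample(size=64, min_crop=16):
    """Return (current_view, goal_crop, target) where target holds the
    normalized crop center (x, y) and side length. Illustrative only."""
    img = rng.random((size, size))             # synthetic noise "view"
    s = int(rng.integers(min_crop, size))      # crop side length
    y = int(rng.integers(0, size - s + 1))
    x = int(rng.integers(0, size - s + 1))
    crop = img[y:y + s, x:x + s]
    target = np.array([(x + s / 2) / size,     # normalized center x
                       (y + s / 2) / size,     # normalized center y
                       s / size])              # normalized size
    return img, crop, target
```

A network trained to map (view, crop) pairs to these three numbers learns a relative-localization representation that, per the abstract, transfers from pure noise to natural home images.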
|
| |
| 15:30-17:00, Paper MoBIP-19.8 | Add to My Program |
| Spatio-Temporal Attention Network for Persistent Monitoring of Multiple Mobile Targets |
|
| Wang, Yizhuo | National University of Singapore |
| Wang, Yutong | National University of Singapore |
| Cao, Yuhong | National University of Singapore |
| Sartoretti, Guillaume Adrien | National University of Singapore (NUS) |
Keywords: Deep Learning Methods, Motion and Path Planning, Surveillance Robotic Systems
Abstract: This work focuses on the persistent monitoring problem, where a set of targets moving based on an unknown model must be monitored by an autonomous mobile robot with a limited sensing range. To keep each target's position estimate as accurate as possible, the robot needs to adaptively plan its path to (re-)visit all the targets and update its belief from measurements collected along the way. In doing so, the main challenge is to strike a balance between exploitation, i.e., re-visiting previously-located targets, and exploration, i.e., finding new targets or re-acquiring lost ones. Encouraged by recent advances in deep reinforcement learning, we introduce an attention-based neural solution to the persistent monitoring problem, where the agent can learn the inter-dependencies between targets, i.e., their spatial and temporal correlations, conditioned on past measurements. This endows the agent with the ability to determine which target, time, and location to attend to across multiple scales, which we show also helps relax the usual limitations of a finite target set with prior positional information. We experimentally demonstrate that our method outperforms other baselines in terms of the number of target visits and the average estimation error in complex environments. Finally, we implement and validate our model in a drone-based simulation experiment to monitor mobile ground targets in a high-fidelity simulator.
|
| |
| 15:30-17:00, Paper MoBIP-19.9 | Add to My Program |
| Subtask Aware End-To-End Learning for Visual Room Rearrangement |
|
| Kim, Youngho | KAIST (Korea Advanced Institute of Science and Technology) |
| Kim, Jong-Hwan | KAIST |
Keywords: Deep Learning Methods, Perception-Action Coupling, Long term Interaction
Abstract: The goal of intelligent embodied agents is to learn how to explore the environment, interact with objects, and understand the environment in order to achieve task objectives. There are two main approaches to training such agents: one is to train an action policy that performs the task goal through end-to-end learning, and the other is to construct a policy by implementing the necessary abilities according to the task goal in a modular manner. For complex and long-horizon tasks, such as visual room rearrangement, a modular approach that infers the task sequence by identifying the causality of actions through prior knowledge shows higher performance. Based on this insight, we propose an Online Subtask Prediction Network (OSPNet) that determines the subtask to be performed at each moment from the environment information and the past subtask inference history, allowing an embodied agent to be trained for long-horizon tasks in an end-to-end manner. We also propose a Subtask Aware Policy Network (SAPNet) as the action policy that decides actions based on the reasoning of the OSPNet. We implement an embodied agent that performs visual room rearrangement using the proposed SAPNet and train it through imitation learning, demonstrating similar or better performance with far fewer training steps than previous works.
|
| |
| 15:30-17:00, Paper MoBIP-19.10 | Add to My Program |
| Disentangling Crowds Interactions for Pedestrians Trajectory Prediction |
|
| Bhujel, Niraj | A*STAR |
| Yau, Wei-Yun | I2R |
Keywords: Deep Learning Methods, Human-Aware Motion Planning, Probabilistic Inference
Abstract: Predicting the future actions of multiple pedestrians is an essential feature for autonomous robots co-working in human-crowded environments. Estimating the unknown future path is a challenging problem due to the complex interactions occurring among pedestrians. Although recent developments in Graph Convolutional Networks (GCN) allow for efficient encoding of such complex interactions, the encoded representations still lack the informative factors necessary to accurately predict future behavior. To solve this, we introduce the Disentangled GCN (DGCN), which aims to better capture crowd interactions by decoupling the spatial and temporal factors. More specifically, we propose to encode the crowd interactions with two low-dimensional latent spaces, a spatial latent and a temporal latent, and to decode the pedestrians' future behavior using the learned latents. We propose a novel regularizer function to train these latents in an unsupervised manner and condition the trajectory prediction on the learned latents using a spatially aware graph decoder. The proposed method is evaluated extensively on publicly available datasets consisting of pedestrians and vehicles. Our method improves mADE on the ETH/UCY pedestrian datasets and achieves new state-of-the-art mFDE on the nuScenes vehicle dataset.
|
| |
| 15:30-17:00, Paper MoBIP-19.11 | Add to My Program |
| EAAINet: An Element-Wise Attention Network with Global Affinity Information for Accurate Indoor Visual Localization |
|
| Dai, Kun | HIT |
| Xie, Tao | Harbin Institute of Technology |
| Wang, Ke | Harbin Institute of Technology |
| Jiang, Zhiqiang | Harbin Institute of Technology |
| Liu, Dedong | Harbin Institute of Technology |
| Li, Ruifeng | Harbin Institute of Technology |
| Wang, Jiahe | Harbin Institute of Technology |
Keywords: Deep Learning Methods, Transfer Learning, Deep Learning for Visual Perception
Abstract: Visual localization, a vital component of many visual applications, has been tackled by scene coordinates regression (SCoRe) methods that leverage neural networks to predict scene coordinates, followed by a PnP algorithm to recover camera pose. Nevertheless, these methods do not consider the relationship between image patches, known as relative features or affinity information, which can improve the capability of the network to perform complete scene parsing. Additionally, owing to the visual similarity between image patches, these methods are incapable of extracting reliable absolute features, resulting in inferior performance. In response, we propose EAAINet that is based on classical SCoRe-based approaches and consists of two novel modules: the Global Affinity Aggregation Module (GAAM) and the Element-wise Attention Module (EAM). Specifically, GAAM employs an interval sampling strategy to sample image patches to construct sparse graph neural networks (GNN), from which global affinity information between image patches is retrieved, hence ensuring precise scene parsing. EAM integrates multi-level features to generate reliable absolute features to regress accurate scene coordinates, with the key insight that the structure information is essential to differentiate similar image patches and the semantic information assists in modeling regression problems. Technically, EAM predicts element-wise soft attention masks to reconcile multi-level feature maps, enabling efficient feature fusion. Positional encoding and uncertainty modeling are also employed to enhance visual localization performance. Our proposed GAAM and EAM are designed as generic modules that can be assembled into modern SCoRe-based networks to boost performance. Experimental results show
|
| |
| 15:30-17:00, Paper MoBIP-19.12 | Add to My Program |
| Transformer-Based Neural Augmentation of Robot Simulation Representations |
|
| Serifi, Agon | ETH Zurich |
| Knoop, Espen | The Walt Disney Company |
| Schumacher, Christian | Disney Research |
| Kumar, Naveen | The Walt Disney Company |
| Gross, Markus | ETH Zurich |
| Bächer, Moritz | Disney Research |
Keywords: Deep Learning Methods, Simulation and Animation, Machine Learning for Robot Control
Abstract: Simulation representations of robots have advanced in recent years. Yet, there remain significant sim-to-real gaps because of modeling assumptions and hard-to-model behaviors such as friction. In this letter, we propose to augment common simulation representations with a transformer-inspired architecture, by training a network to predict the true state of robot building blocks given their simulation state. Because we augment building blocks, rather than the full simulation state, we make our approach modular which improves generalizability and robustness. We use our neural network to augment the state of robot actuators, and also of rigid body states. Our actuator augmentation generalizes well across robots, and our rigid body augmentation results in improvements even under high uncertainty in model parameters.
|
| |
| MoBIP-20 Late breaking, Hall E |
Add to My Program |
| Late Breaking Posters II |
|
| |
| |
| 15:30-17:00, Paper MoBIP-20.1 | Add to My Program |
| Towards Robust 3D Robot Perception in Urban Environments: The UT Campus Object Dataset |
|
| Zhang, Arthur | University of Texas at Austin |
| Eranki, Chaitanya | University of Texas at Austin |
| Zhang, Christina | University of Texas at Austin |
| Hong, Raymond | University of Texas at Austin |
| Kalyani, Pranav | University of Texas at Austin |
| Kalyanaraman, Lochana | University of Texas at Austin |
| Gamare, Arsh | University of Texas at Austin |
| Esteva, Maria | University of Texas at Austin |
| Biswas, Joydeep | University of Texas at Austin |
Keywords: Data Sets for Robotic Vision, Object Detection, Segmentation and Categorization, Visual Learning
Abstract: We introduce the UT Campus Object Dataset (CODa), a mobile robot egocentric perception dataset collected on the University of Texas Austin Campus. Our dataset contains 8.5 hours of multimodal sensor data: synchronized 3D point clouds and stereo RGB video from a 128-channel 3D LiDAR and two 1.25MP RGB cameras at 10 fps; RGB-D videos from an additional 0.5MP sensor at 7 fps, and a 9-DOF IMU sensor at 40 Hz. We provide 58 minutes of ground-truth annotations containing 1.3 million 3D bounding boxes with instance IDs for 53 semantic classes, 5000 frames of 3D semantic annotations for urban terrain, and pseudo-ground truth localization. We repeatedly traverse identical geographic locations for a wide range of indoor and outdoor areas, weather conditions, and times of the day. Using CODa, we empirically demonstrate that: 1) 3D object detection performance in urban robotics settings is significantly higher when trained using CODa compared to existing datasets even when employing state-of-the-art domain adaptation approaches, 2) pre-training on CODa followed by fine-tuning leads to improved performance on autonomous vehicle (AV) datasets, and 3) sensor-specific fine-tuning improves 3D object detection accuracy. Using our dataset and annotations, we release benchmarks for 3D object detection and 3D semantic segmentation using established metrics. In the future, the CODa benchmark will include additional tasks like unsupervised object discovery and re-identification. We publicly release CODa on the Texas Data Repository: https://dataverse.tdl.org/dataset.xhtml?persistentId=doi:10.18738/T8/BBOQMV, pre-trained models: https://github.com/ut-amrl/utcoda-models, and CODa development package: https://github.com/ut-amrl/coda-devkit. We expect CODa to be a valuable dataset for research in egocentric 3D perception and planning for autonomous navigation in urban environments.
|
| |
| 15:30-17:00, Paper MoBIP-20.2 | Add to My Program |
| AnyLoc: Towards Universal Visual Place Recognition |
|
| Keetha, Nikhil Varma | Carnegie Mellon University |
| Mishra, Avneesh | International Institute of Information Technology, Hyderabad |
| Karhade, Jay | Carnegie Mellon University |
| Jatavallabhula, Krishna Murthy | MIT |
| Scherer, Sebastian | Carnegie Mellon University |
| Krishna, Madhava | IIIT Hyderabad |
| Garg, Sourav | University of Adelaide |
Keywords: Deep Learning for Visual Perception, SLAM, Recognition
Abstract: Visual Place Recognition (VPR) is vital for robot localization. To date, the most performant VPR approaches are environment- and task-specific: while they exhibit strong performance in structured environments (predominantly urban driving), their performance degrades severely in unstructured environments, rendering most approaches brittle to robust real-world deployment. In this work, we develop a universal solution to VPR -- a technique that works across a broad range of structured and unstructured environments (urban, outdoors, indoors, aerial, underwater, and subterranean environments) without any re-training or fine-tuning. We demonstrate that general-purpose feature representations derived from off-the-shelf self-supervised models with no VPR-specific training are the right substrate upon which to build such a universal VPR solution. Combining these derived features with unsupervised feature aggregation enables our suite of methods, AnyLoc, to achieve up to 4X significantly higher performance than existing approaches. We further obtain a 6% improvement in performance by characterizing the semantic properties of these features, uncovering unique domains which encapsulate datasets from similar environments. Our detailed experiments and analysis lay a foundation for building VPR solutions that may be deployed anywhere, anytime, and across anyview. We encourage the readers to explore our project page and interactive demos: https://anyloc.github.io/.
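The "unsupervised feature aggregation" the abstract refers to can be illustrated with a bare-bones VLAD-style pooling of dense local features, the kind of aggregation AnyLoc-like pipelines layer on top of self-supervised backbones. This is a generic sketch (in practice the centroids would come from k-means on a calibration set):

```python
import numpy as np

def vlad_aggregate(local_feats, centroids):
    """Aggregate (n, d) local features against (k, d) cluster centroids:
    hard-assign each feature to its nearest centroid, sum the residuals
    per cluster, flatten, and L2-normalize. Illustrative only."""
    k, d = centroids.shape
    assign = ((local_feats[:, None, :] - centroids[None]) ** 2).sum(-1).argmin(1)
    vlad = np.zeros((k, d))
    for c in range(k):
        members = local_feats[assign == c]
        if len(members):
            vlad[c] = (members - centroids[c]).sum(axis=0)  # residual sum
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad  # global place descriptor
```

Place recognition then reduces to nearest-neighbor search between these global descriptors, with no VPR-specific training of the backbone.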
|
| |
| 15:30-17:00, Paper MoBIP-20.3 | Add to My Program |
| Design and Implementation of a User-Controlled Obstacle Avoiding Robot Using ROS |
|
| Washum, Joseph | Hendrix College |
Keywords: Telerobotics and Teleoperation
Abstract: Building upon previous classroom work using Q-learning, we plan to integrate an Xbox controller with Q-learning to help the robot learn to follow user instructions while avoiding obstacles. Specifically, we look to adapt previous work with EV3 Lego Mindstorms robots to an iRobot Create 3 robot using ROS 2. The robot will move based on controller input and change its direction if the user decides that the robot made an incorrect move, storing the correct and incorrect movements as Q-values. Ultimately, this project looks to implement and improve upon these foundational topics and use them to create an autonomous obstacle-avoiding robot.
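The core update behind this scheme is ordinary tabular Q-learning, with the user's "wrong move" signal from the controller mapped to a negative reward. The sketch below shows that idea with our own names; it is not the project's actual code:

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step over a dict-of-dicts table.
    A user correction (e.g., a button press after a bad move) would be
    fed in as a negative reward. Illustrative only."""
    best_next = max(q[next_state].values()) if q.get(next_state) else 0.0
    q.setdefault(state, {}).setdefault(action, 0.0)
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]
```

Repeated corrections drive the Q-value of the disapproved (state, action) pair down, so the greedy policy gradually stops selecting it.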
|
| |
| 15:30-17:00, Paper MoBIP-20.4 | Add to My Program |
| Stable Dishware Pushing Via Convolutional Neural Networks |
|
| Hong, Youngjin | Sungkyunkwan University |
| Jung, Hong-ryul | Sungkyunkwan University |
| Seo, Sungwon | SungKyunKwan University |
| Jeon, Jeongmin | Sungkyunkwan University |
| Kim, Jonghyun | Sungkyunkwan University |
| Moon, Hyungpil | Sungkyunkwan University |
Keywords: Deep Learning in Grasping and Manipulation, AI-Based Methods, Task and Motion Planning
Abstract: Pushing objects is a valuable technique for manipulating large or unwieldy objects when gripping them is not feasible. This is particularly applicable in tasks like dish clearance, where pushing wider dishes instead of grasping them is a practical choice for relocation. However, the challenge lies in performing analytical push planning without knowledge of important physical characteristics such as the friction coefficient and center of friction. To overcome this challenge, we propose a supervised learning approach for stable planar pushing of dishware with unknown physical properties. The approach utilizes convolutional neural networks (CNN) to evaluate the probability of a successful push based on the depth image of the object and a planar pushing direction. The model's output, combined with the Hybrid A* algorithm, enables the planning of a push path that ensures stable relocation to a desired location. Training data is sampled from various domains, including dish object meshes, friction coefficients, and dish poses, to ensure robust performance. Experimental results show that the trained model and path planning algorithm robustly handle uncertainties in object properties, with an average relocation success rate of 82%.
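At planning time, a model of this kind is queried over a discrete set of candidate push directions and the highest-scoring one feeds the path search. The wrapper below is our illustration of that step; `predict_success` stands in for the paper's CNN:

```python
import numpy as np

def best_push_direction(depth_image, predict_success, n_dirs=16):
    """Score `n_dirs` evenly spaced planar push directions with a learned
    success model and return the most promising angle (radians).
    Illustrative only; the paper couples this with Hybrid A* search."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_dirs, endpoint=False)
    scores = [predict_success(depth_image, a) for a in angles]
    return angles[int(np.argmax(scores))]
```

Within a Hybrid A* expansion, the same per-direction scores could instead weight edge costs so that low-probability pushes are explored only when no stable alternative exists.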
|
| |
| 15:30-17:00, Paper MoBIP-20.5 | Add to My Program |
| Depth Camera Video-Based Reservoir Computing for Accurate Classification of American Sign Language |
|
| Thongking, Witchuda | Shibaura Institue of Technology |
| Wiranata, Ardi | Shibaura Institute of Technology |
| Maeda, Shingo | Tokyo Institute of Technology |
| Premachandra, Chinthaka | Shibaura Institute of Technology |
Keywords: Deep Learning Methods, Humanoid and Bipedal Locomotion, Optimization and Optimal Control
Abstract: Depth cameras have emerged as successful devices for facilitating interaction between humans and machines. Their advantages, including lightweight design, durability, high image quality, high sensitivity, privacy, and versatility, have motivated researchers to explore their application in advanced domains. However, developing a comprehensive method that can capture human motion with enhanced detail and is applicable to a wide range of scenarios remains a significant challenge. In this study, we present a novel approach for American Sign Language (ASL) classification using depth-camera videos, which capture the spatial dimensions of hand movements during signing; a preprocessing step that transforms the raw depth-camera video into meaningful signals; and reservoir computing for classification. ASL plays a crucial role in facilitating communication for individuals with hearing impairments, and accurate classification of ASL gestures is essential for effective translation and interpretation.
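The reservoir-computing stage can be pictured as an echo-state network: the preprocessed depth signals drive a fixed random recurrent network, and only a linear readout over the resulting state trajectory is trained for classification. The sketch below illustrates the reservoir part with hyperparameters of our own choosing, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(42)

def run_reservoir(inputs, n_res=50, leak=0.3, spectral_radius=0.9):
    """Drive a fixed random leaky-tanh reservoir with an input sequence
    of shape (timesteps, n_in) and return the (timesteps, n_res) state
    trajectory for a downstream linear readout. Illustrative only."""
    n_in = inputs.shape[1]
    w_in = rng.uniform(-1, 1, (n_res, n_in))
    w = rng.uniform(-1, 1, (n_res, n_res))
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))  # echo-state scaling
    x = np.zeros(n_res)
    states = []
    for u in inputs:
        x = (1 - leak) * x + leak * np.tanh(w_in @ u + w @ x)
        states.append(x)
    return np.array(states)
```

Because only the readout is trained (typically by ridge regression over these states), training is cheap compared to backpropagating through a recurrent network, which suits lightweight gesture-recognition pipelines.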
|
| |
| 15:30-17:00, Paper MoBIP-20.6 | Add to My Program |
| Assistive Agile Robot for Non-Visual Navigation |
|
| Hata, Rayna | Carnegie Mellon University |
| Doore, Stacy A. | Colby College |
Keywords: Design and Human Factors, Human-Robot Teaming, Physical Human-Robot Interaction
Abstract: Problem Statement Guide dogs are a trusted form of assistive navigation for individuals with blindness or low vision (BLV). Studies of handler-dog pairs have found that guide dogs provide companionship and emotional support to their BLV handlers [1]. However, guide dogs have a limited working lifespan, high training and care costs, and require a profound commitment to maintain the well-being of multiple dogs over a BLV handler's lifespan. The objective of this study is to adapt a commercial quadruped robot as a non-visual navigation tool. Several research teams are working in this space with small quadruped robots that navigate through simple indoor spaces [2][3]. The smaller robots require longer handles, which alters typical handler positioning and results in limited body communication between the human-robot pair. Larger robots reduce this issue by having a similar size to a typical guide dog. Main Findings Our team partnered with an expert guide dog handler to supervise the co-design process. We observed human-animal pair communication methods, feedback systems, behaviors, and shared responsibilities. We developed a voice-based app to give the robot direction and orientation commands. The embedded cameras and proximity sensor placements required a modified handle to be placed on the rear half of the robot. The handle design allowed for 120 degrees of movement, providing the handler greater freedom of movement in the wrist. This study was conducted to evaluate the functionality of the handle and to test a set of human-robot interaction and navigation tasks: 1) give 4 commands to the robot via the voice interface app, 2) use the handle to walk 100 ft., and 3) navigate a short flight of stairs. Participants completed a post-study survey about the interactions, the handle prototype, and perceptions of trust in robot navigation.
Our co-designer reported that the handle worked well, was comfortable, and allowed them to simulate the same walking, turning, and stair movements as with their guide dog. Sighted participants reported the handle was easy to hold, moved smoothly, and felt stable. The preliminary findings suggest that industrial agile robots are able to replicate many of the basic tasks that guide dogs perform. Conclusion Early testing suggests that a large industrial robot has many of the essential features, the preferred sizing, and the sophisticated sensing abilities required of a novice guide dog. Future work will include a new handle design, expanded voice-based interactions, and object detection of common navigation features. REFERENCES [1] L. Whitmarsh, "The benefits of guide dog ownership," Visual Impairment Research, vol. 7, no. 1, pp. 27-42, Jan. 2005. [2] A. Xiao et al., "Robotic guide dog: Leading a human with leash-guided hybrid physical interaction," Jun. 2021, arXiv:2103.14300. [3] H. Hwang et al., "System configuration and navigation of a guide dog robot: Toward animal guide dog-level guiding work," Oct. 2022, http://arxiv.org/abs/2210.13368.
|
| |
| 15:30-17:00, Paper MoBIP-20.7 | Add to My Program |
| Enhancing Dexterity in Robotic Manipulation Via Hierarchical Contact Exploration |
|
| Cheng, Xianyi | Carnegie Mellon University |
| Patil, Sarvesh | Carnegie Mellon University School of Computer Science |
| Temel, Zeynep | Carnegie Mellon University |
| Kroemer, Oliver | Carnegie Mellon University |
| Mason, Matthew T. | Carnegie Mellon University |
Keywords: Dexterous Manipulation, Manipulation Planning, In-Hand Manipulation
Abstract: We present a hierarchical planning framework for dexterous robotic manipulation (HiDex). The framework exploits in-hand and extrinsic dexterity by actively exploring contacts, generating rigid-body motions and complex contact sequences. It is based on Monte-Carlo Tree Search (MCTS); an overview can be found in Figure 1. The framework offers two main advantages. First, it allows efficient global reasoning over the high-dimensional, complex space created by contacts, solving a diverse set of manipulation tasks that require dexterity. Second, it allows the incorporation of expert knowledge through MCTS and customizable setups in task mechanics and models, and hence could provide a generalizable solution for various manipulation tasks. In our code, setting up new scenarios and adjusting search parameters requires modifying only one setup.yaml file. We instantiate this framework on manipulation with extrinsic dexterity and on in-hand manipulation. Example tasks include card picking, book-out-of-bookshelf, peg-out-of-hole, block flipping, occluded grasping, upward peg-in-hole, sideways peg-in-hole, planar reorientation, planar block passing, and in-hand reorientation. We also demonstrate some of them on two robot platforms. As future work, we envision this framework being extended toward general manipulation planning that incorporates global reasoning, mechanics, learning, and optimization.
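For readers unfamiliar with the search backbone mentioned above, the standard MCTS loop (selection, expansion, rollout, backpropagation) can be sketched in a few lines. This is an illustrative toy on a scalar state, not the authors' HiDex implementation; all names and the toy task are hypothetical:

```python
import math
import random

class Node:
    """One node in a generic MCTS tree over a discrete action space."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = {}   # action -> Node
        self.visits = 0
        self.value = 0.0     # running mean of rollout rewards

def ucb1(child, parent_visits, c=1.4):
    # Upper-confidence bound: balances exploitation and exploration.
    if child.visits == 0:
        return float("inf")
    return child.value + c * math.sqrt(math.log(parent_visits) / child.visits)

def mcts(root_state, actions, step, reward, iters=200, depth=5, seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # 1) Selection: descend via UCB1 while nodes are fully expanded.
        while node.children and len(node.children) == len(actions):
            node = max(node.children.values(),
                       key=lambda ch: ucb1(ch, node.visits))
        # 2) Expansion: try one untried action.
        untried = [a for a in actions if a not in node.children]
        a = rng.choice(untried)
        node.children[a] = Node(step(node.state, a), parent=node)
        node = node.children[a]
        # 3) Rollout: random playout to a fixed depth.
        s = node.state
        for _ in range(depth):
            s = step(s, rng.choice(actions))
        r = reward(s)
        # 4) Backpropagation: update mean value up to the root.
        while node is not None:
            node.visits += 1
            node.value += (r - node.value) / node.visits
            node = node.parent
    # Return the most-visited action at the root.
    return max(root.children, key=lambda a: root.children[a].visits)

# Toy task: drive a scalar "state" toward 10 with +1/-1 actions.
best = mcts(0, [+1, -1], step=lambda s, a: s + a,
            reward=lambda s: -abs(s - 10))
print(best)  # the +1 action should dominate
```

In a contact-planning setting, `step` would apply a contact-mode transition and `reward` would score task progress; the UCB1 constant `c` trades off exploring rarely-tried contact sequences against exploiting known good ones.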
|
| |
| 15:30-17:00, Paper MoBIP-20.8 | Add to My Program |
| Functional Grasping of Tools Using Approach Heatmaps |
|
| Aburub, Malek | Osaka University |
| Higashi, Kazuki | Osaka University |
| Wan, Weiwei | Osaka University |
| Harada, Kensuke | Osaka University |
Keywords: Dexterous Manipulation, Multifingered Hands, Grasping
Abstract: This work presents a framework that enables tool use with highly dexterous end-effectors. By focusing on the functional parts of the robotic end-effector and the object, an approach heatmap is created for the grasp planner to use for functional grasping of the tool. This is done without the need for human demonstration, enabling the robotic gripper to use different tools and adapt to variations in their design.
|
| |
| 15:30-17:00, Paper MoBIP-20.9 | Add to My Program |
| Data-Driven Distributionally Robust Mitigation of Risk of Cascading Failures |
|
| Liu, Guangyi | Lehigh University |
| Motee, Nader | Lehigh University |
Keywords: Distributed Robot Systems, Robust/Adaptive Control, Optimization and Optimal Control
Abstract: We present a novel data-driven methodology to address the risk of cascading failures in delayed discrete-time Linear Time-Invariant (LTI) systems. Real-world scenarios often involve uncertainties in the distribution of noise, which can further exacerbate the risk of systemic failure and increase the difficulty of employing effective control policies to mitigate it. To tackle this challenge, we formulate a distributionally robust finite-horizon optimal control problem. The objective is to minimize a piecewise affine cost function while bounding the Average Value-at-Risk (AVaR) of a set of constraints on state and input variables. These constraints are crucial in the presence of input time-delay and initial conditions close to failure, which make the system vulnerable to cascading failures. Despite the inherent difficulty of the optimal control problem, our approach effectively achieves the desired system performance. Additionally, it enforces a set of constraints that ensure the AVaR of cascading failures remains manageable over a finite time horizon. The implications of our findings are significant for the design and implementation of control systems that are susceptible to cascading failures.
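For reference, the Average Value-at-Risk used above admits the standard textbook variational form; for a scalar loss X and risk level ε ∈ (0, 1] (illustrative notation, not necessarily the paper's):

```latex
\operatorname{AVaR}_{\varepsilon}(X)
  = \inf_{z \in \mathbb{R}} \left\{ z + \frac{1}{\varepsilon}\,
    \mathbb{E}\!\left[\max(X - z,\, 0)\right] \right\}
```

A distributionally robust constraint then bounds the worst case \(\sup_{\mathbb{P} \in \mathcal{P}} \operatorname{AVaR}_{\varepsilon}^{\mathbb{P}}(X)\) over a data-driven ambiguity set \(\mathcal{P}\) of noise distributions.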
|
| |
| 15:30-17:00, Paper MoBIP-20.10 | Add to My Program |
| Breaking Symmetries Leads to Diverse Quadrupedal Gaits |
|
| Ding, Jiayu | Syracuse University |
| Sanyal, Amit | Syracuse University |
| Gan, Zhenyu | Syracuse University |
Keywords: Dynamics, Legged Robots
Abstract: Legged locomotion has drawn considerable interest in the past few years. It has been shown that symmetry in motion plays an important role in gait pattern selection and controller design. However, owing to differing research preferences, different definitions have been given for gaits and their symmetries. In this work, we propose a general approach using group theory to uniquely define gaits and symmetries. Without loss of generality, we apply a spring-loaded inverted pendulum model with tunable parameters as a representative model to search for solutions for all asymmetrical gaits. Our results show that asymmetrical gaits are interconnected through numerical bifurcation and symmetry breaking. The results of this work can provide insights into the relationships between gaits. Furthermore, because the number of symmetries in locomotion is finite, it can be proven that only a limited number of gaits can be identified. Finally, our results can help in designing controllers based on the analysis of the symmetries of different gaits.
|
| |
| 15:30-17:00, Paper MoBIP-20.11 | Add to My Program |
| A Numerical Integrator for Forward Dynamics Simulations of Folding Process for Protein Molecules Modeled As Hyper-Redundant Robots |
|
| Kacem, Amal | University of Michigan Dearborn |
| Zbiss, Khalil | University of Michigan - Dearborn |
| Mohammadi, Alireza | University of Michigan, Dearborn |
Keywords: Dynamics, Modeling, Control, and Learning for Soft Robots
Abstract: This paper investigates the development of an efficient numerical integrator for forward dynamics simulation of the protein folding process, where protein molecules are modeled as robotic mechanisms consisting of rigid nano-linkages with many degrees of freedom. To address the computational burden associated with fixed step-size explicit Euler methods, we develop a fast numerical scheme with an adaptive step-size strategy for computing the folding pathway of protein molecules.
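The step-size adaptation idea can be illustrated with the classic step-doubling trick: take one full Euler step and two half steps, use their difference as a local error estimate, and grow or shrink the step accordingly. A minimal sketch, assuming a scalar ODE and simple tolerance logic (not the authors' integrator):

```python
import math

def integrate_adaptive(f, t0, y0, t_end, h0=1e-2, tol=1e-6, h_min=1e-8):
    """Explicit Euler with step-doubling error control.

    One full step of size h is compared against two half steps; the
    difference estimates the local error and drives the step size.
    Illustrative only -- not the paper's scheme.
    """
    t, y, h = t0, y0, h0
    while t < t_end:
        h = min(h, t_end - t)
        y_full = y + h * f(t, y)                  # one step of size h
        y_half = y + 0.5 * h * f(t, y)            # two steps of size h/2
        y_half = y_half + 0.5 * h * f(t + 0.5 * h, y_half)
        err = abs(y_half - y_full)
        if err <= tol or h <= h_min:
            t, y = t + h, y_half                  # accept the finer result
            if err < 0.25 * tol:
                h *= 2.0                          # smooth region: grow step
        else:
            h = max(0.5 * h, h_min)               # stiff region: shrink step
    return y

# Decay ODE y' = -y, y(0) = 1: the exact solution at t = 1 is exp(-1).
y1 = integrate_adaptive(lambda t, y: -y, 0.0, 1.0, 1.0)
print(abs(y1 - math.exp(-1)) < 1e-2)  # close to the exact value
```

A production folding simulator would apply the same accept/reject logic per step to the multibody dynamics of the linkage model, where the cost of each right-hand-side evaluation makes adaptive stepping pay off.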
|
| |
| 15:30-17:00, Paper MoBIP-20.12 | Add to My Program |
| Validation of an Algorithm for the Estimation of Human Wrist Stiffness |
|
| Giovannetti, Giorgia | Newcastle University |
| Buscaglione, Silvia | Università Campus Bio-Medico Di Roma |
| Noccaro, Alessia | Università Campus Bio-Medico Di Roma |
| Formica, Domenico | Newcastle University |
Keywords: Dynamics, Neurorobotics, Wearable Robotics
Abstract: Estimating the impedance of human joints is crucial to study neuro-motor mechanisms and assess related disorders. The aim of this study is to validate an algorithm for the estimation of passive wrist stiffness, i.e. the static passive component of the impedance. The algorithm is a geometrical framework based on a subject-specific kinematic model and was previously validated on a hardware mockup. Here we tested it on three healthy volunteers through manual perturbations. Results obtained from experiments are promising and set the basis for future validation using a robotic platform to provide wrist movements.
|
| |
| 15:30-17:00, Paper MoBIP-20.13 | Add to My Program |
| ROS 2.0 in the Classroom |
|
| Khounborine, Isaac | Hendrix College |
Keywords: Education Robotics
Abstract: This project seeks to adapt robotics projects made for different, older robots to work within the framework of ROS 2.0 (Robot Operating System 2.0) and an iRobot Create 3 robot. With the continued evolution of robots and robotics, it stands to reason that courses teaching robotics should evolve alongside them. The project is designed to test how to create and adapt robotics projects using ROS 2.0 that work in a classroom environment.
|
| |
| 15:30-17:00, Paper MoBIP-20.14 | Add to My Program |
| Assessing the Internal Odometry Systems in the iRobot Create 3 |
|
| Jackson, Henry | Hendrix College |
Keywords: Education Robotics, Autonomous Vehicle Navigation
Abstract: This research focuses on assessing the precision and accuracy of the odometry systems in an iRobot Create 3 robot. Odometry is what allows the robot to estimate its position based on data from its motion sensors. In the case of the Create 3, the robot's position is calculated using an optical odometry sensor, wheel encoders, and an inertial measurement unit (IMU). This research tests how accurately the odometry results match the initial goal measurements.
|
| |
| 15:30-17:00, Paper MoBIP-20.15 | Add to My Program |
| Air Pollution Modeling Via Mobile Sensor Networks and State Estimation |
|
| Nagata, Cole | Harvey Mudd College |
| Shia, Victor | Harvey Mudd College |
Keywords: Environment Monitoring and Management, Path Planning for Multiple Mobile Robots or Agents, Sensor Networks
Abstract: Fine and ultrafine particles, a category of aerosol pollution, are linked to various health issues; however, mapping them over a large area typically requires a large network of fixed sensors, a method that is inefficient, expensive, and in some cases impractical. This paper proposes a novel method of monitoring and mapping particulate matter concentrations over a large area by using state estimation techniques in conjunction with a network of mobile sensors. In validating this method, we find the proposed approach to be not only viable but also fairly accurate, depending on various factors.
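The state estimation step can be illustrated in its simplest form: a scalar Kalman filter fusing repeated noisy readings that mobile sensors collect for one grid cell of the concentration map. A minimal sketch under assumed noise statistics (the true concentration, variances, and units below are hypothetical, and this is not the authors' estimator):

```python
import random

def kalman_scalar(measurements, meas_var, x0=0.0, p0=1e6, q=0.0):
    """Scalar Kalman filter: fuse noisy readings of one grid cell's
    concentration. Setting q > 0 lets the estimate track a drifting
    field; q = 0 treats the field as static. Illustrative sketch only."""
    x, p = x0, p0
    for z in measurements:
        p += q                   # predict: process noise inflates variance
        k = p / (p + meas_var)   # Kalman gain
        x += k * (z - x)         # update: pull estimate toward measurement
        p *= (1 - k)             # posterior variance shrinks
    return x, p

rng = random.Random(1)
true_conc = 35.0                 # hypothetical ug/m^3 in one map cell
readings = [true_conc + rng.gauss(0, 5) for _ in range(50)]
est, var = kalman_scalar(readings, meas_var=25.0)
print(round(est, 1), var < 1.0)
```

A full mapping system would run one such estimate per cell (or a spatially correlated filter over all cells), with each mobile sensor contributing measurements only for the cells it passes through.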
|
| |
| 15:30-17:00, Paper MoBIP-20.16 | Add to My Program |
| Fall Detection of a Planar Four-Link Bipedal Robot |
|
| Mungai, M. Eva | University of Michigan |
| Grizzle, J.W | University of Michigan |
Keywords: Failure Detection and Recovery, Humanoid Robot Systems, Robot Safety
Abstract: We aim to detect falls in a planar four-link bipedal robot during closed-loop standing tasks, considering constraints on false positive and negative rates, while maximizing the lead time (i.e., time between fall declaration and the robot entering an unrecoverable state). The standing task is important in its own right and will prepare us for more complex motions. A key challenge lies in the crowding phenomenon of incipient faults (e.g., they resemble normal states) and the masking effects of the controller's corrective actions.
|
| |
| 15:30-17:00, Paper MoBIP-20.17 | Add to My Program |
| Robotic Quantification of Soil Organic Carbon for Mitigating Climate Change |
|
| Aziz, Faiza | University of Illinois Urbana-Champaign |
| Fang, Ming | University of Illinois Urbana Champaign |
| Uppalapati, Naveen Kumar | University of Illinois at Urbana-Champaign |
| Di Fulvio, Angela | University of Illinois at Urbana-Champaign |
| Chowdhary, Girish | University of Illinois at Urbana Champaign |
Keywords: Field Robots, Robotics and Automation in Agriculture and Forestry, Environment Monitoring and Management
Abstract: In the terrestrial biosphere, soil is the largest reservoir of carbon, with 1,700 gigatons of carbon in the top one meter. Even small changes in Soil Organic Carbon (SOC) stocks can contribute tremendously to the reduction of greenhouse gases; SOC sequestration therefore has huge potential for mitigating climate change. Furthermore, carbon sequestration indirectly improves overall soil quality, leading to higher crop yields and safeguarding global food security [1]. This work reports the preliminary development of a state-of-the-art robotic system that can quantify carbon stocks down to a depth of 2 meters. Existing experimental methods for SOC prediction have drawbacks such as low spatial resolution, long measurement times, and the need for soil pretreatment [2]. To address these issues, we propose a robotic system employing a non-invasive, in-situ Inelastic Neutron Scattering (INS) method for SOC quantification. INS is a novel analytical technique for in-situ soil assay which employs the unique gamma signature from carbon for SOC detection and localization [3]. We present simulated results of the soil assay using the Monte Carlo N-Particle code [4]. Our proposed method is innovative in that it integrates the INS interrogating system with a heavy-payload robot for fast, continuous, and autonomous scanning of the soil [5], [6]. Moreover, it investigates the feasibility of SOC assessment with a traditional deuterium-tritium neutron generator and compares it with the more complex associated-particle detection, which enables SOC measurement with 1 wt% (weight percent) accuracy in less than 10 minutes [7]. ACKNOWLEDGMENT: This research is supported by USDA grants iCOVER (NR233A750004G066) and iFARM (2022-77038-37306). The authors thank EarthSense Inc. for their support and contributions to the project. REFERENCES: [1] G. Moinet, R. Hijbeek et al., "Carbon for soils, not soils for carbon," Global Change Biology, vol. 29, Jan. 2023. [2] T. Angelopoulou, A. Balafoutis et al., "From laboratory to proximal sensing spectroscopy for soil organic carbon estimation: a review," Sustainability, vol. 12, p. 443, Jan. 2020. [3] A. Kavetskiy, G. Yakubova et al., "Scanning mode application of neutron-gamma analysis for soil carbon mapping," Pedosphere, vol. 29, no. 3, pp. 334-343, 2019. [4] T. Goorley, M. James et al., "Initial MCNP6 release overview," Nuclear Technology, vol. 180, no. 3, pp. 298-315, 2012. [5] V. Higuti, A. Velasquez et al., "Under canopy light detection and ranging-based autonomous navigation," Journal of Field Robotics, vol. 36, Dec. 2018. [6] M. V. Gasparino, A. N. Sivakumar et al., "WayFAST: Navigation with predictive traversability in the field," IEEE Robotics and Automation Letters. [7] M. Ayllon Unzueta, B. Ludewigt et al., "An all-digital associated particle imaging system for the 3D determination of isotopic distributions," Review of Scientific Instruments, vol. 92, p. 063305, Jun. 2021.
|
| |
| 15:30-17:00, Paper MoBIP-20.18 | Add to My Program |
| Guaranteed Force Tracking Control under Unknown Environment |
|
| Jung, Seul | Chungnam National University |
| Ryu, Ho Ju | Chungnam National University |
| Hur, Sung hoon | Chungnam National University |
Keywords: Force Control
Abstract: This paper presents the analysis of a bilinear force/position control scheme for guaranteed force tracking performance of a robot manipulator under an unknown environment. Simulation studies of a force control task for a robot manipulator are conducted to verify the proposed scheme.
|
| |
| 15:30-17:00, Paper MoBIP-20.19 | Add to My Program |
| Signal Temporal Logic-Guided Model Predictive Control for Robust Bipedal Locomotion Resilient to Runtime Terrain Perturbations |
|
| Gu, Zhaoyuan | Georgia Institute of Technology |
| Guo, Rongming | Georgia Institute of Technology |
| Yates, William | Georgia Institute of Technology |
| Boyd, Nathan | Georgia Institute of Technology |
| Chen, Sixing | Georgia Institute of Technology |
| Zhao, Ye | Georgia Institute of Technology |
Keywords: Formal Methods in Robotics and Automation, Humanoid and Bipedal Locomotion, Optimization and Optimal Control
Abstract: Bipedal robots have remarkable potential for executing agile maneuvers. However, their current task-planning capabilities remain insufficient for achieving truly robust and natural locomotion. This work aims to address this issue by developing a framework that employs formal methods to guarantee the robot's stability when subjected to unexpected terrain perturbations. Specifically, we design agile and safe crossed-leg maneuvers to enhance locomotion stability under these challenging situations. Our key research question revolves around how to ensure task specification correctness and maximize the robustness of complex dynamical systems. To achieve this, we devised a task-planning synthesis method for perturbation recovery and integrated it into a motion planning and control framework. The core task-planning component of the proposed framework is a model predictive controller (MPC). The MPC leverages the robustness objective from signal temporal logic (STL) to synthesize the center-of-mass and swing-leg trajectories. To enhance robust locomotion recovery, the MPC adapts the walking-step durations of the planned trajectory. Additionally, data-driven kinematic constraints are introduced to ensure safety by preventing leg self-collision. A passivity-based controller is then employed to track the centroidal and swing-leg states. This work marks the first study to incorporate the formal guarantees offered by STL into the planning process for perturbed bipedal locomotion. We demonstrate the efficacy of the STL-based MPC through perturbation experiments conducted on the CAREN system, an omnidirectional perturbing platform. The bipedal robot Cassie is systematically subjected to perturbations of varied magnitude, direction, and onset timing. The MPC solves in real time at 30 Hz, and the passivity controller operates at 2000 Hz. Maneuvers from the experimental results showcase the effectiveness of our approach in maintaining stability even in challenging situations.
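The STL robustness objective mentioned above has a simple quantitative reading for basic temporal operators: "always" takes the worst-case margin over the horizon, "eventually" the best-case. A minimal sketch of this standard quantitative semantics on a sampled signal (the signal values and thresholds below are hypothetical, and this is not the paper's MPC formulation):

```python
def rho_always_ge(signal, c):
    """Robustness of G (x >= c) over a sampled signal: the worst-case
    margin. Positive => satisfied with slack; negative => violated."""
    return min(x - c for x in signal)

def rho_eventually_ge(signal, c):
    """Robustness of F (x >= c): the best-case margin over the horizon."""
    return max(x - c for x in signal)

# Toy center-of-mass height trace (hypothetical units):
# specification: the height must always stay above 0.6.
com_height = [0.78, 0.74, 0.71, 0.69, 0.72, 0.76]
print(round(rho_always_ge(com_height, 0.6), 2))       # 0.09 (satisfied)
print(round(rho_eventually_ge(com_height, 0.75), 2))  # 0.03
```

In an STL-guided MPC, such a robustness value (computed over the predicted trajectory with min/max made differentiable or encoded via mixed-integer constraints) enters the cost so the optimizer maximizes the satisfaction margin rather than merely checking feasibility.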
|
| |
| 15:30-17:00, Paper MoBIP-20.20 | Add to My Program |
| Model-Based Tactile Regrasping with the Smart Suction Cup |
|
| Lee, Jungpyo | University of California, Berkeley |
| Lee, Sebastian | University of California Berkeley |
| Huh, Tae Myung | UC Berkeley |
| Stuart, Hannah | UC Berkeley |
Keywords: Grasping, Force and Tactile Sensing, Sensor-based Control
Abstract: Suction cups are an important gripper type in industrial robot applications. Vision-based planners can fail on adversarial objects or lose generalizability in unseen scenarios. We propose haptic exploration to improve suction cup grasping when visual grasp planners fail. We present the Smart Suction Cup, an end-effector that utilizes internal flow measurements for tactile sensing. We show that model-based haptic search methods, guided by these flow measurements, improve grasping success by up to 2.5x compared with using only a vision planner during a bin-picking task.
|
| |
| 15:30-17:00, Paper MoBIP-20.21 | Add to My Program |
| Enhancing the Performance of Pneu-Net Actuators Using a Torsion Resistant Strain Limiting Layer |
|
| Good, Ian | University of Washington |
| Balaji, Srivatsan | University of Washington |
| Lipton, Jeffrey | University of Washington |
Keywords: Grippers and Other End-Effectors, Soft Robot Materials and Design, Compliant Joints and Mechanisms
Abstract: A key barrier to the adoption of soft fluid-based grippers is their limited ability to grasp larger payloads, as objects slip out of their grasp. We overcome this limitation by introducing a torsionally rigid strain limiting layer (TRL) to a standard pneu-net gripper. This reduces out-of-plane (OOP) bending of soft pneumatic fingers while maintaining their softness and in-plane flexibility. We characterize the design space of the TRL for a pneu-net gripper and evaluate its out-of-plane bending performance against a benchmark pneu-net gripper. We found that the TRL reduced OOP bending by 97.7% (σ = 0.08%) relative to the benchmark pneu-net gripper under a 50.5 g load. In practice, we demonstrate a payload capacity of 5 kg when loading using the skeleton.
|
| |
| 15:30-17:00, Paper MoBIP-20.22 | Add to My Program |
| Safe Force Feedback for Haptic Interfacing in Robot-Assisted Surgery |
|
| Mazidi, Aiden | Concordia University |
| Sayadi, Amir | McGill University |
| Kazemipour, Negar | Concordia University |
| Dargahi, Javad | Concordia University |
| Barralet, Jake | McGill University |
| Hooshiar, Amir | McGill University |
Keywords: Haptics and Haptic Interfaces, Telerobotics and Teleoperation, Virtual Reality and Interfaces
Abstract: This study presents a method for enhancing haptic feedback in robot-assisted surgery, known as nonlinear impedance matching (NIMA). In contrast to previous linear impedance methods, NIMA identifies and factors in the complex changes in tool-tissue contact impedance parameters in real time. The study demonstrated a 22% improvement in accuracy with NIMA compared to linear impedance matching, indicating that the linear method may not fully capture the dynamic changes experienced in surgical procedures. The NIMA method was tested with surrogate components and a virtual-reality soft body, demonstrating its practicability. This proof-of-concept study signifies the potential for NIMA's application in commercially available robotic teleoperation systems in the future.
|
| |
| 15:30-17:00, Paper MoBIP-20.23 | Add to My Program |
| Feasibility of Force Feedback on Hyper-Elastic Bodies Using Haptic Gloves |
|
| Kang, Hyeseon | Seoul National University of Science and Technology |
| Kim, Jinhyun | Seoul National University of Science and Technology |
Keywords: Haptics and Haptic Interfaces, Wearable Robotics
Abstract: Implementing haptic feedback for hyper-elastic bodies necessitates a real-time Finite Element Method (FEM) technique, which requires the simultaneous processes of parameter measurement, reaction force calculation, and haptic feedback. In this paper, we selected haptic gloves as the haptic device and demonstrated that the above processes can be performed simultaneously, even when using finger joint angle changes as a parameter. Through this approach, we verified the feasibility of integrating haptic gloves and the real-time FEM technique for implementing haptic feedback on hyper-elastic bodies.
|
| |
| 15:30-17:00, Paper MoBIP-20.24 | Add to My Program |
| A Novel Haptic Glove with 2-DoF Force Feedback on Single Finger |
|
| Zhou, Jianfeng | Case Western Reserve University |
| Gong, Yifeng | Case Western Reserve University |
| Daltorio, Kathryn A | Case Western Reserve University |
Keywords: Haptics and Haptic Interfaces, Wearable Robotics, Physical Human-Robot Interaction
Abstract: This paper presents the design and control of an exoskeleton glove to provide kinesthetic feedback on a single finger. This haptic glove has two actuators on a single finger to generate a 2 DoF force vector on the fingertip. With this glove, the operator can determine the shape and trend of the contact surface by feeling the magnitude and direction of the force.
|
| |
| 15:30-17:00, Paper MoBIP-20.25 | Add to My Program |
| Prediction of Human Center of Mass Position from Ground Reaction Forces |
|
| Alizadeh Noghani, Mohsen | University of Notre Dame |
| Bolívar-Nieto, Edgar | University of Notre Dame |
Keywords: Human and Humanoid Motion Analysis and Synthesis, Prosthetics and Exoskeletons, Wearable Robotics
Abstract: For optimization-based control of prosthetic legs such as the Open Source Leg (OSL), knowledge of the configuration of the lower limbs over a future window of time can enable a predictive control action (e.g., model predictive control and trajectory planning). Measurement of the ground reaction forces (GRF) on robotic prosthetic legs can provide an estimate of the center of mass (CoM) acceleration, and thus position, from which it is possible to infer the joint angles. Previous work has focused on estimating CoM position only during walking, and the proposed methods are mainly applicable in offline post-processing. To predict the future CoM position in real time, investigating the effect of the prediction window length and of assumptions about the CoM kinematics is crucial; this work presents a preliminary investigation of those parameters.
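The GRF-to-CoM step rests on Newton's second law: the net vertical force m·z̈ = F_z − m·g gives the CoM acceleration, which is then integrated twice for position. A minimal sketch for the vertical axis, assuming known mass and initial conditions (hypothetical numbers, not the paper's estimator):

```python
def com_trajectory(grf_z, mass, z0, v0, dt, g=9.81):
    """Vertical CoM position from vertical GRF via Newton's second law:
    m * z_ddot = F_z - m * g, integrated twice (semi-implicit Euler).
    Illustrative sketch only -- real use needs drift correction."""
    z, v, traj = z0, v0, []
    for fz in grf_z:
        a = fz / mass - g      # net vertical acceleration
        v += a * dt            # integrate acceleration -> velocity
        z += v * dt            # integrate velocity -> position
        traj.append(z)
    return traj

# Quiet standing: the GRF exactly balances body weight, so the
# CoM should stay at its initial height.
mass, dt = 70.0, 0.01
traj = com_trajectory([mass * 9.81] * 100, mass, z0=0.95, v0=0.0, dt=dt)
print(round(traj[-1], 6))  # 0.95
```

In practice, double integration of noisy force-plate or pylon load-cell data drifts quickly, which is one reason the abstract emphasizes the prediction window length and kinematic assumptions.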
|
| |
| 15:30-17:00, Paper MoBIP-20.26 | Add to My Program |
| 3D Visual Skeleton Recognition for Instantaneous Phase Identification in Sit-To-Stand Movements with a Mobile Assistive Robot in Close Proximity |
|
| Mahdi, Anas | University of Waterloo |
| Dong, Zonghao | Tohoku University |
| Lin, Jonathan Feng-Shun | University of Waterloo |
| Hirata, Yasuhisa | Tohoku University |
| Mombaur, Katja | Karlsruhe Institute of Technology |
Keywords: Human Detection and Tracking, Physically Assistive Devices, Physical Human-Robot Interaction
Abstract: The Sit-To-Stand (STS) movement is fundamental and plays a vital role in the daily activities of older adults. A decline in STS ability can result in functional limitations and difficulty completing daily tasks independently. In this paper, we propose a vision-based method for estimating the human pose during the STS movement at close range, utilizing a depth camera integrated onto the SkyWalker robotic rollator. One key application of our method is phase classification: identifying the specific phase of STS that the user is in. The phase classification algorithm uses the Mediapipe model to provide a full-body skeleton structure describing the user's posture. To assess the accuracy of the proposed software, we compared the 3D skeleton tracking results with those obtained from a Vicon motion capture system, analyzing both kinematic accuracy and the similarity of trends. The classification results identify which of the four phases of STS (as defined for SkyWalker use) the user is currently in, allowing the robot to adjust its actions accordingly and provide optimal support, thus preventing falls.
|
| |
| 15:30-17:00, Paper MoBIP-20.27 | Add to My Program |
| A Minimal Universal Framework for Context-Aware Collaboration |
|
| Panoff, Maximillian | University of Florida |
| Isnard, Achil | ESIREM |
| Bobda, Christophe | University of Arkansas |
Keywords: Human-Aware Motion Planning, Human-Robot Collaboration, Human-Robot Teaming
Abstract: As automation advances, humans and robots will increasingly need to work together to accomplish tasks. This is known as Human-Robot Collaboration (HRC) and is an unsolved problem in robotics. One particularly interesting approach is Context-Aware HRC, in which knowledge of the overall goals and environment is used by robotic systems to better support humans [LIU2_context_aware_pose_2021]. While many works in this field have accomplished this goal with varying degrees of success, they still require a priori knowledge in the form of application-specific tasks, to the point where on-the-fly programming methods are actively researched [elZaatari_programming_survey_2019, George_task_complexity_survey_2023]. Creating a single unified framework that supports tasks across multiple domains such as manufacturing, warehousing, and assisted living requires identifying the properties of tasks that are common across all of these areas. To solve this issue, we use a set of atomic attributes which can be combined to define Context. Additionally, we identify a set of Basic Assistive Goals that exist across HRC applications. Through this, HRC solutions should be able to immediately provide benefits to their users, which additional fine-tuning can enhance.
|
| |
| 15:30-17:00, Paper MoBIP-20.28 | Add to My Program |
| Exploring LLM in Intention Modeling for Human-Robot Collaboration |
|
| Li, Sikai | University of Michigan |
| Peng, Run | University of Michigan, Ann Arbor |
| Dai, Yinpei | University of Michigan |
| Lee, Jenny | University of Michigan - Ann Arbor |
| Chai, Joyce | University of Michigan |
Keywords: Human-Robot Collaboration, Embodied Cognitive Science, Cognitive Modeling
Abstract: Humans develop Theory of Mind (ToM) at a young age: the ability to understand that others have intents, beliefs, knowledge, skills, etc. that may differ from our own. Modeling others' mental states plays an important role in human-human communication and collaborative tasks. As a new generation of cognitive robots starts to enter our lives, it is important for these robots to have similar ToM abilities in order to collaborate effectively with humans. While there is an increasing amount of work on ToM modeling for collaborative tasks in human-agent collaboration, most of it has been situated in simulated environments. In this work, we take an initial step toward ToM modeling powered by large language models (GPT-4) in human-robot communication and collaboration. In particular, we applied prompt engineering in a one-shot setting to give the robot the ability to infer the human's intention and generate corresponding responses.
|
| |
| 15:30-17:00, Paper MoBIP-20.29 | Add to My Program |
| A Trust-Based Robot Autonomy Framework to Improve Human-Robot Collaboration Productivity for Future Smart Manufacturing |
|
| Wang, Weitian | Montclair State University |
Keywords: Human-Robot Collaboration, Intelligent and Flexible Manufacturing
Abstract: Recently, collaborative robots have been making a revolutionary shift from the traditional robot-in-cage model to the human-robot-partnership model in emerging working contexts such as smart manufacturing. Although tremendous research effort has been devoted to diverse technologies for enhancing robots' working efficiency, multiple studies indicate that humans' trust in and acceptance of robots still need to be improved when robots coexist with humans in collaborative contexts. This LBR presents a new trust-based robot autonomy framework for future human-robot partnerships to improve collaboration productivity in smart manufacturing contexts. The proposed framework is described, and preliminary results are presented.
|
| |
| 15:30-17:00, Paper MoBIP-20.30 | Add to My Program |
| Mitigating Human Uncertainties in Human-Robot Collaborative Transportation with Whole-Body Dynamics |
|
| Mahmud, Al Jaber | George Mason University |
| Nguyen, Duc | George Mason University |
| Xiao, Xuesu | George Mason University |
| Wang, Xuan | George Mason University |
Keywords: Human-Robot Collaboration, Optimization and Optimal Control, Human-Aware Motion Planning
Abstract: Collaborative human-robot systems can significantly reduce human workloads. One frequently encountered task in engineering settings is object transportation. To employ a human and a mobile robotic arm to perform co-transportation, the key challenges arise from the uncertainties of human behaviors, which may not adhere strictly to optimal trajectories, and the increased control complexity due to the coupling of the robotic arm and its mobile base. To address these issues, our goal is to develop a new control scheme that can efficiently compensate for human uncertainties. The approach offers two benefits: (i) the controller leverages the whole-body dynamics of the robot to achieve better end-effector mobility, rather than performing separate control on the robot arm and the mobile base; (ii) the robot's pose is informed by the specific ways in which humans cause uncertainties, instead of compensating them passively.
|
| |
| 15:30-17:00, Paper MoBIP-20.31 | Add to My Program |
| Hierarchical Robot Planning and Abstraction of Shared Autonomy |
|
| Yousefi, Ehsan | McGill University |
| Chen, Mo | Simon Fraser University |
| Sharf, Inna | McGill University |
Keywords: Human-Robot Collaboration, Task and Motion Planning, AI-Based Methods
Abstract: Operating an articulated machine involves a hierarchy of complex tasks, ranging from strategic route planning to low-level control, and is highly intertwined with the specific requirements of the application domain. In this work, we propose a novel shared autonomy framework in which an articulated robot is operated by a human jointly with an autonomous agent. We provide strategies for designing both task-oriented hierarchical planning and policy-shaping algorithms for efficient human-robot interaction and context-aware operation of articulated robots. Our framework for the interplay between the human and the autonomy, as the participating agents in the system, is particularly influenced by ideas from multi-agent systems, game theory, and theory of mind for a sliding level of autonomy. Figure 1 presents a graphical model of decision making in the proposed shared autonomy architecture; it includes three building blocks which in turn reflect the three major novelties and contributions of this work: 1. We provide a novel formulation of the uncertainty-aware sequential hierarchical human-in-the-loop decision-making process by extending the MDP and Options frameworks to shared autonomy for task-oriented robot planning (Figure 1a). 2. To fine-tune the formulation to a human, we set up a novel conditional Variational Auto-Encoder (cVAE) architecture, in which we use the history of the system states, human actions, and their error with respect to a surrogate optimal model to encode the human's internal state embeddings, beyond the designed values (Figure 1b). 3. We bring in pre-training-related state variables to represent prior training and knowledge (Figure 1c). To showcase the success of our framework, we fine-tune it for the operation of a feller-buncher, an articulated machine employed in the timber harvesting industry. 
Currently, these machines are operated by humans and involve a series of physically and mentally arduous tasks, further exacerbated by harsh environmental conditions. Drawing on intricate know-how of these tasks, we propose a novel, human-inspired path planning algorithm that encodes the sequence of decisions/actions in operating the machine's crane (the articulated arm) to fell and bunch trees. We use this case study as our test-bed to train and test different policies. In training the policies, we employ deep RL techniques. In presenting results, we consider a number of scenarios and cases of importance to a shared autonomy framework. First, we use a pre-trained fully autonomous policy as the surrogate optimal model. By gathering data from actual human trials, we were able to train a cVAE model to access a human's internal embeddings. We assessed the success of our novel platform by forming certain hypotheses regarding the effects of our designed structure and variables. In testing the trained shared autonomy policy, we looked at the performance of the model in interacting with human agents with different skill levels and degrees of cooperativeness.
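The cVAE idea described in the abstract (encoding a human's internal state from the history of system states, human actions, and their error against a surrogate optimal policy) can be sketched as a single forward pass. All dimensions, layer sizes, and the synthetic data below are hypothetical illustrations, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, b1, W2, b2):
    # Tiny two-layer perceptron used for both encoder and decoder.
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

# Hypothetical sizes: history length, state dim, action dim, latent dim.
H, S, A, Z = 10, 4, 2, 3
D = H * (S + A + A)  # encoder input: states, human actions, action error

def init(shape):
    return rng.normal(0.0, 0.1, shape)

# Encoder: history -> (mu, logvar) of the human's internal-state embedding.
We1, be1 = init((D, 32)), np.zeros(32)
We2, be2 = init((32, 2 * Z)), np.zeros(2 * Z)
# Decoder: (latent embedding, current state) -> predicted human action.
Wd1, bd1 = init((Z + S, 32)), np.zeros(32)
Wd2, bd2 = init((32, A)), np.zeros(A)

def encode(history):
    out = mlp(history.reshape(-1), We1, be1, We2, be2)
    return out[:Z], out[Z:]  # mu, logvar

def reparameterize(mu, logvar):
    # Standard VAE reparameterization trick.
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

def decode(z, state):
    return mlp(np.concatenate([z, state]), Wd1, bd1, Wd2, bd2)

# One forward pass on synthetic data standing in for a real trial.
states = rng.standard_normal((H, S))
actions = rng.standard_normal((H, A))
surrogate_optimal = rng.standard_normal((H, A))  # stand-in optimal policy
error = actions - surrogate_optimal
history = np.concatenate([states, actions, error], axis=1)  # (H, S + 2A)

mu, logvar = encode(history)
z = reparameterize(mu, logvar)
predicted_action = decode(z, states[-1])
```

Training would add the usual cVAE objective (action reconstruction loss plus a KL term on the latent), which is omitted here for brevity.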
|
| |
| 15:30-17:00, Paper MoBIP-20.32 | Add to My Program |
| Building Human-Robot Team Situation Awareness |
|
| Ali, Arsha | University of Michigan |
| Robert, Lionel | University of Michigan |
| Tilbury, Dawn | University of Michigan |
Keywords: Human-Robot Teaming, Human-Robot Collaboration, Human Factors and Human-in-the-Loop
Abstract: Situation awareness is important for decision making and performance, yet team situation awareness has been less examined, especially in human-robot teams. Shared mental models and communication may influence team situation awareness in human-robot teams. We present preliminary results from a between-subjects experimental design that manipulates the shared mental model and communication amount in a team of one human and two unmanned ground robots to further investigate team situation awareness.
|
| |
| MoSP1 Keynote session, Hall D |
Add to My Program |
| Keynote M1 - Sven Behnke |
|
| |
| Chair: Ramirez-Amaro, Karinne | Chalmers University of Technology |
| |
| 17:00-18:00, Paper MoSP1.1 | Add to My Program |
| From Intuitive Immersive Telepresence Systems to Conscious Service Robots |
|
| Behnke, Sven | University of Bonn |
Keywords: Humanoid Robot Systems, Telerobotics and Teleoperation
Abstract: Intuitive immersive telepresence systems enable
transporting human presence to remote locations in real
time. The participants of the recent ANA Avatar XPRIZE
competition developed robotic systems that allow operators
to see, hear, and interact with a remote environment in a
way that feels as if they are truly there. In the keynote,
I will present the competition tasks and results. My team
NimbRo won the $5M Grand Prize. I will detail our
approaches for the design of the operator station, the
avatar robot, and the software. While telepresence enables
a multitude of applications such as telemedicine and remote
assistance, for other scenarios autonomy is required. In
the second part of my keynote, I argue that consciousness
is needed to adapt quickly to novel tasks in open-ended
domains and to be aware of their own limitations. I will present
a research concept for developing conscious service robots
that systematically generalize their knowledge to cope with
novelty and monitor themselves to obtain more information
when needed, to avoid risks, and to detect and mitigate
errors. This new generation of robots has much potential
for numerous open-ended application domains, including
assistance in everyday environments. Prof. Dr. Sven Behnke
has held the chair for Autonomous Intelligent Systems at
the University of Bonn, Germany, since 2008 and heads the
Computer Science Institute VI - Intelligent Systems and
Robotics there. He graduated in 1997 from
Martin-Luther-Universität Halle-Wittenberg (Dipl.-Inform.)
and received his doctorate in computer science
(Dr. rer. nat.) from Freie Universität Berlin in
2002. In 2003 he did postdoctoral research on robust speech
recognition at the International Computer Science Institute
in Berkeley, CA. From 2004 to 2008 Professor Behnke led the
Emmy Noether Junior Research Group "Humanoid Robots" at
Albert-Ludwigs-Universität Freiburg. His research interests
include cognitive robotics, computer vision, and machine
learning. Prof. Behnke received several Best Paper Awards,
three Amazon Research Awards (2018-20), a Google Faculty
Research Award (2019), and the Ralf-Dahrendorf-Prize of
the BMBF for the European Research Area (2019). His team NimbRo
has won numerous robot competitions (RoboCup Humanoid
Soccer, RoboCup@Home, MBZIRC, ANA Avatar XPRIZE).
|
| |
| MoSP2 Keynote session, Grand Ballroom A |
Add to My Program |
| Keynote M2 - Michelle Johnson |
|
| |
| Chair: Leite, Iolanda | KTH Royal Institute of Technology |
| |
| 17:00-18:00, Paper MoSP2.1 | Add to My Program |
| Towards More Inclusive Rehabilitation Robots |
|
| Johnson, Michelle J. | University of Pennsylvania |
Keywords: Rehabilitation Robotics
Abstract: Michelle J. Johnson, Ph.D., is currently Associate
Professor of Physical Medicine and Rehabilitation at the
University of Pennsylvania. She has secondary appointments
as an Associate Professor in Bioengineering and in
Mechanical Engineering and Applied Mechanics. She has a
Bachelor of Science in Mechanical Engineering and Applied
Mechanics from the University of Pennsylvania and a PhD in
Mechanical Engineering, with an emphasis in mechatronics,
robotics, and design, from Stanford University. She
completed a NSF-NATO post-doctoral fellowship at the
Advanced Robotics Technology and Systems Laboratory at the
Scuola Superiore Sant'Anna in Italy. She directs the
Rehabilitation Robotic Research and Design Laboratory
located at the Pennsylvania Institute of Rehabilitation
Medicine at the University of Pennsylvania, School of
Medicine. The lab is also affiliated with the General
Robotics Automation Sensing Perception (GRASP) Lab. Dr.
Johnson's lab specializes in the design, development, and
therapeutic use of novel, affordable, intelligent robotic
assistants for rehabilitation in high- and low-resource
environments with an emphasis on using robotics and sensors
to quantify upper limb motor function in adults and
children with brain injury or at risk for brain injury.
Dr. Johnson has spent over twenty years applying technology
solutions to aid in the understanding of disability and
impairment after brain injury. She was a Fulbright
Scholar to Botswana for 2020-2022 and an IEEE Engineering
in Medicine and Biology Society Distinguished Lecturer
for 2021-2022.
|
| |
| MoSP3 Keynote session, Grand Ballroom B |
Add to My Program |
| Keynote M3 - Rebecca Kramer-Bottiglio |
|
| |
| Chair: Coad, Margaret M. | University of Notre Dame |
| |
| 17:00-18:00, Paper MoSP3.1 | Add to My Program |
| Shape-Shifting Soft Robots That Adapt to Changing Tasks and Environments |
|
| Kramer-Bottiglio, Rebecca | Yale University |
Keywords: Soft Robot Applications, Soft Robot Materials and Design
Abstract: Soft robots have the potential to augment their morphology,
properties, and behavioral control policies to adapt to
different tasks and environments. Inspired by the dynamic
plasticity and general adaptability of animals, this talk
will introduce several shape-shifting soft robot
platforms (for example, robotic fabrics and legged robots
with morphing limbs) capable of editing their physical
structure to perform tasks more efficiently under changing
task demands or in multiple environments. The talk will
also cover the multifunctional material components (for
example, stretchable electronics, soft actuators, and
variable stiffness materials) that enable predictable shape
change. Harnessing these engineered materials and
mechanisms yields access to a gamut of adaptive
capabilities for increasingly life-like adaptive robots. Rebecca Kramer-Bottiglio is the John J. Lee Associate
Professor of Mechanical Engineering and Materials Science
at Yale University. Focusing on the intersection of
materials, manufacturing, and robotics, her group is
deriving new multifunctional materials that will allow
next-generation robots to adapt their morphology and
behavior to changing tasks and environments. She is the
winner of multiple early career awards including the NSF
Career Award, the NASA Early Career Award, the AFOSR Young
Investigator Award, and the ONR Young Investigator Award.
She was named to the Forbes "30 under 30" list for her
approach to manufacturing liquid metals through printable
emulsions and scalable sintering methods. She received the
Presidential Early Career Award for Scientists and
Engineers (PECASE) award, the highest honor bestowed by the
U.S. government on outstanding scientists and engineers
beginning their independent careers, for her development of
robotic skins that turn inanimate objects into
multifunctional robots. She serves as an Associate Editor
of Soft Robotics and IEEE T-RO, as well as a Senior Editor
of IJRR, and was General Chair of the IEEE International
Conference on Soft Robotics (RoboSoft) in 2020 and 2021.
She was named an IEEE Distinguished Lecturer in 2019, a
National Academy of Engineering (NAE) Gilbreth Lecturer in
2022, and a National Academy of Science (NAS) Kavli Fellow
in 2023. She also serves on the Technology, Innovation &
Engineering Committee of the NASA Advisory Council.
|
| |